The dataset contains the height and weight of 10 students. We want to check whether weight has a linear relationship with height, and then fit a simple linear regression model that predicts weight from height.
dim(student)
## [1] 10 2
student
##    ht  wt
## 1  63 127
## 2  64 121
## 3  66 142
## 4  69 157
## 5  69 162
## 6  71 156
## 7  71 169
## 8  72 165
## 9  73 181
## 10 75 208
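The source does not show how `student` was loaded. As a minimal sketch for reproducing the analysis, the data frame can be entered by hand from the listing above; the original was presumably read from a file, so treat this construction as an assumption rather than the author's method.

```r
# Re-create the data frame from the printed listing (values copied from above).
student <- data.frame(
  ht = c(63, 64, 66, 69, 69, 71, 71, 72, 73, 75),          # height
  wt = c(127, 121, 142, 157, 162, 156, 169, 165, 181, 208) # weight
)

dim(student)   # number of rows and columns
str(student)   # column names and types
```

The units are not stated in the source, so none are assumed here.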
attach(student)
par(mfrow =c(2,2))
boxplot(wt, horizontal = TRUE, main="Boxplot of Wt")
boxplot(ht, horizontal = TRUE, main="Boxplot of Ht")
hist(wt)
hist(ht)
par(mfrow=c(1,1))
plot(ht,wt, col="Blue", main="Height Vs Weight")
cor.test(ht,wt)
##
## Pearson's product-moment correlation
##
## data: ht and wt
## t = 8.3466, df = 8, p-value = 3.214e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.7864411 0.9877259
## sample estimates:
## cor
## 0.9470984
The scatter plot shows the relationship between height and weight. The Pearson correlation test gives r = 0.947 (95% confidence interval 0.786 to 0.988, p-value = 3.214e-05), so there is a strong positive linear correlation between height and weight.
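To see where the numbers reported by `cor.test` come from, the test statistic can be recomputed by hand as t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The sketch below assumes the data frame is entered manually from the listing; the variable names `t_stat` and `p_value` are illustrative.

```r
# Data copied from the listing; entered by hand for a self-contained example.
student <- data.frame(
  ht = c(63, 64, 66, 69, 69, 71, 71, 72, 73, 75),
  wt = c(127, 121, 142, 157, 162, 156, 169, 165, 181, 208)
)

n <- nrow(student)
r <- cor(student$ht, student$wt)             # Pearson correlation coefficient
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)    # test statistic for H0: correlation = 0
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)  # two-sided p-value

print(c(r = r, t = t_stat, p = p_value))
```

These values should agree with the `cor`, `t` and `p-value` entries in the `cor.test` output.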
Now let us establish a regression model to predict weight:
SLM <- lm(wt~ht, data=student)
summary(SLM)
##
## Call:
## lm(formula = wt ~ ht, data = student)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -13.2339  -4.0804  -0.0963   4.6445  14.2158
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -266.5344    51.0320  -5.223    8e-04 ***
## ht             6.1376     0.7353   8.347 3.21e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.641 on 8 degrees of freedom
## Multiple R-squared: 0.897, Adjusted R-squared: 0.8841
## F-statistic: 69.67 on 1 and 8 DF, p-value: 3.214e-05
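The fitted equation is wt = -266.5344 + 6.1376 * ht, so each additional unit of height is associated with about 6.14 more units of weight. As a sketch of how the model could be used for prediction, the example below refits it on hand-entered data and calls `predict()`; the heights 65 and 70 in `new_students` are hypothetical values chosen inside the observed range (63 to 75).

```r
# Data copied from the listing; entered by hand for a self-contained example.
student <- data.frame(
  ht = c(63, 64, 66, 69, 69, 71, 71, 72, 73, 75),
  wt = c(127, 121, 142, 157, 162, 156, 169, 165, 181, 208)
)

SLM <- lm(wt ~ ht, data = student)   # same model as in the text
coef(SLM)                            # intercept and slope

# Hypothetical new heights (not part of the original data).
new_students <- data.frame(ht = c(65, 70))

# Point predictions with 95% prediction intervals.
pred <- predict(SLM, newdata = new_students,
                interval = "prediction", level = 0.95)
print(pred)
```

The point predictions come out at roughly 132.4 and 163.1. With only 10 observations the prediction intervals are wide, and predictions outside the observed height range would be extrapolation.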
anova(SLM)
## Analysis of Variance Table
##
## Response: wt
##           Df Sum Sq Mean Sq F value    Pr(>F)
## ht         1 5202.2  5202.2  69.666 3.214e-05 ***
## Residuals  8  597.4    74.7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
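The ANOVA table ties directly to the model summary: R-squared is the regression sum of squares divided by the total, the F value is the ratio of the two mean squares, and the residual standard error is the square root of the residual mean square. The sketch below recomputes these from the table on hand-entered data; the names `ss`, `df`, `r_squared`, `f_value` and `rse` are illustrative.

```r
# Data copied from the listing; entered by hand for a self-contained example.
student <- data.frame(
  ht = c(63, 64, 66, 69, 69, 71, 71, 72, 73, 75),
  wt = c(127, 121, 142, 157, 162, 156, 169, 165, 181, 208)
)

SLM <- lm(wt ~ ht, data = student)
tab <- anova(SLM)

ss <- tab[["Sum Sq"]]   # c(regression SS, residual SS)
df <- tab[["Df"]]       # c(1, n - 2)

r_squared <- ss[1] / sum(ss)               # share of variance explained by ht
f_value   <- (ss[1] / df[1]) / (ss[2] / df[2])  # ratio of mean squares
rse       <- sqrt(ss[2] / df[2])           # residual standard error

print(c(R2 = r_squared, F = f_value, RSE = rse))
```

For a model with a single predictor, the F value is also the square of the slope's t value (8.347^2 is about 69.67), which is why both tests report the same p-value.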
The regression plot is shown below:
plot(ht, wt, xlab="Height", ylab="Weight", main="Simple Linear Regression")
abline(SLM, col="blue")  # add the fitted regression line to the scatter plot