data(women)
attach(women)
summary(women)
## height weight
## Min. :58.0 Min. :115.0
## 1st Qu.:61.5 1st Qu.:124.5
## Median :65.0 Median :135.0
## Mean :65.0 Mean :136.7
## 3rd Qu.:68.5 3rd Qu.:148.0
## Max. :72.0 Max. :164.0
mymod <- lm(weight~height, data= women)
mymod
##
## Call:
## lm(formula = weight ~ height, data = women)
##
## Coefficients:
## (Intercept) height
## -87.52 3.45
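The fitted line is weight = -87.52 + 3.45 * height, so each additional unit of height is associated with about 3.45 more units of weight on average. As a minimal sketch, the coefficients can be pulled out of the model and the line evaluated by hand. The names `b` and `by_hand` are illustrative and not part of the original analysis; the result should agree with the fitted value that `predict()` reports for a height of 60.

```r
# Refit the model so this block runs on its own
mymod <- lm(weight ~ height, data = women)

# coef() returns a named vector: (Intercept), height
b <- coef(mymod)

# Evaluate the fitted line at height = 60 by hand
by_hand <- unname(b["(Intercept)"] + b["height"] * 60)
by_hand
```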
We can find the correlation between weight and height:
cor(weight, height)
## [1] 0.9954948
The correlation is very close to 1. Therefore, there is a strong positive linear relationship between the two variables.
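In a simple linear regression with one predictor, the squared correlation equals the model's R-squared. The sketch below checks this on the same data; `r`, `r2_from_cor`, and `r2_from_lm` are illustrative names introduced here.

```r
# Correlation computed directly from the data frame (no attach() needed)
r <- cor(women$weight, women$height)

# Square of the correlation
r2_from_cor <- r^2

# R-squared reported by the fitted model
r2_from_lm <- summary(lm(weight ~ height, data = women))$r.squared

# The two should agree up to floating-point tolerance
all.equal(r2_from_cor, r2_from_lm)
```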
Create 95% confidence intervals for the model coefficients:
confint(mymod)
## 2.5 % 97.5 %
## (Intercept) -100.342655 -74.690679
## height 3.253112 3.646888
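These intervals follow the usual form of estimate plus or minus a t critical value times the standard error, with 13 residual degrees of freedom here. A minimal sketch that rebuilds them by hand, assuming the default 95% level (`est`, `se`, `tcrit`, and `manual_ci` are illustrative names):

```r
# Refit the model so this block runs on its own
mymod <- lm(weight ~ height, data = women)

# Coefficient table from the model summary
ctab <- coef(summary(mymod))
est <- ctab[, "Estimate"]
se  <- ctab[, "Std. Error"]

# Two-sided 95% critical value from the t distribution
tcrit <- qt(0.975, df = df.residual(mymod))

# Estimate +/- critical value * standard error
manual_ci <- cbind(lower = est - tcrit * se, upper = est + tcrit * se)
manual_ci
```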
Create a prediction interval and a confidence interval for weight when a woman's height is 60:
newdata<- data.frame(height=60)
(predy <- predict(mymod, newdata, interval="prediction") )
## fit lwr upr
## 1 119.4833 115.9412 123.0255
(confy <- predict(mymod, newdata, interval="confidence") )
## fit lwr upr
## 1 119.4833 118.1823 120.7844
This tells us which interval is wider:
confy %*% c(0, -1, 1) #conf interval width
## [,1]
## 1 2.602107
predy %*% c(0, -1, 1) #pred interval width
## [,1]
## 1 7.084336
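The prediction interval covers a single new observation, while the confidence interval covers the mean weight at that height. A single observation varies more than a mean, because its variance includes the residual variance on top of the uncertainty in the estimated mean, so the prediction interval is wider.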
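The difference in width can be traced to the standard errors. As a sketch under the usual linear-model assumptions, the standard error for a new observation is sqrt(se.fit^2 + sigma^2), where sigma is the residual standard error. The names `se_mean`, `se_pred`, `conf_width`, and `pred_width` are illustrative.

```r
# Refit the model so this block runs on its own
mymod <- lm(weight ~ height, data = women)
newdata <- data.frame(height = 60)

# se.fit = TRUE also returns the standard error of the fitted mean,
# the residual degrees of freedom, and the residual standard error
p <- predict(mymod, newdata, se.fit = TRUE)
tcrit <- qt(0.975, df = p$df)

se_mean <- unname(p$se.fit)                      # uncertainty in the mean response
se_pred <- sqrt(se_mean^2 + p$residual.scale^2)  # adds the residual variance

conf_width <- 2 * tcrit * se_mean
pred_width <- 2 * tcrit * se_pred
c(conf = conf_width, pred = pred_width)
```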
They are centered at the same point:
confy[1] == predy[1]
## [1] TRUE
summary(mymod)
##
## Call:
## lm(formula = weight ~ height, data = women)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.7333 -1.1333 -0.3833 0.7417 3.1167
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -87.51667 5.93694 -14.74 1.71e-09 ***
## height 3.45000 0.09114 37.85 1.09e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.525 on 13 degrees of freedom
## Multiple R-squared: 0.991, Adjusted R-squared: 0.9903
## F-statistic: 1433 on 1 and 13 DF, p-value: 1.091e-14
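The summary reports the coefficient estimates with their standard errors, t statistics, and p-values, along with the residual standard error (1.525 on 13 degrees of freedom), R^2 (0.991), and the overall F statistic (1433, p-value 1.091e-14).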
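The individual quantities in the summary can also be extracted programmatically instead of being read off the printout. A minimal sketch, where `s`, `fstat`, and `f_pvalue` are illustrative names:

```r
# Store the summary object so its components can be reused
s <- summary(lm(weight ~ height, data = women))

s$r.squared       # multiple R-squared
s$adj.r.squared   # adjusted R-squared
s$sigma           # residual standard error
coef(s)           # estimates, standard errors, t values, p-values

# F statistic with its numerator and denominator degrees of freedom
fstat <- s$fstatistic

# p-value of the overall F test (upper tail of the F distribution)
f_pvalue <- unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                      lower.tail = FALSE))
f_pvalue
```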