Learning Log 4

R Markdown

data(women)
attach(women)
summary(women)

##      height         weight     
##  Min.   :58.0   Min.   :115.0  
##  1st Qu.:61.5   1st Qu.:124.5  
##  Median :65.0   Median :135.0  
##  Mean   :65.0   Mean   :136.7  
##  3rd Qu.:68.5   3rd Qu.:148.0  
##  Max.   :72.0   Max.   :164.0

mymod <- lm(weight ~ height, data = women)
mymod

## 
## Call:
## lm(formula = weight ~ height, data = women)
## 
## Coefficients:
## (Intercept)       height  
##      -87.52         3.45

We can find the correlation between weight and height:

cor(weight, height)

## [1] 0.9954948

The correlation is very close to 1. Therefore, there is a strong positive linear relationship between weight and height.
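In a simple linear regression, the squared correlation equals the model's R-squared, which gives one way to read the number above. This is a minimal sketch, assuming the built-in `women` data set and the same formula as the model above; the names `r` and `fit` are illustrative only.

```r
# Squared Pearson correlation vs. R-squared of the simple regression
r   <- cor(women$weight, women$height)
fit <- lm(weight ~ height, data = women)

r^2                       # share of the variance in weight explained by height
summary(fit)$r.squared    # same quantity, taken from the fitted model
```

Both lines should print the same value, roughly 0.991.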

Confidence/Prediction Interval

Create 95% confidence intervals for the regression coefficients:

confint(mymod)

##                   2.5 %     97.5 %
## (Intercept) -100.342655 -74.690679
## height         3.253112   3.646888
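These bounds can be rebuilt by hand as estimate plus or minus a t quantile times the standard error. The sketch below assumes the default 95% level and the same model; `fit`, `est`, `se`, `tcrit`, and `manual` are illustrative names, not part of the original log.

```r
# Rebuild the 95% intervals from confint() as estimate +/- t * SE
fit   <- lm(weight ~ height, data = women)
est   <- coef(summary(fit))[, "Estimate"]
se    <- coef(summary(fit))[, "Std. Error"]
tcrit <- qt(0.975, df = df.residual(fit))   # t quantile on the residual df

manual <- cbind(lower = est - tcrit * se, upper = est + tcrit * se)
manual
```

The rows of `manual` should agree with the `confint(mymod)` output above.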

Create a prediction interval and a confidence interval for weight when a woman's height is 60 inches:

newdata <- data.frame(height = 60)
(predy <- predict(mymod, newdata, interval = "prediction"))

##        fit      lwr      upr
## 1 119.4833 115.9412 123.0255

(confy <- predict(mymod, newdata, interval = "confidence"))

##        fit      lwr      upr
## 1 119.4833 118.1823 120.7844

Compare the widths of the two intervals (upper bound minus lower bound):

confy %*% c(0, -1, 1)  #conf interval width

##       [,1]
## 1 2.602107

predy %*% c(0, -1, 1)  #pred interval width

##       [,1]
## 1 7.084336

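The prediction interval covers a single new observation, while the confidence interval covers the mean weight at that height. A single observation carries the residual variance on top of the uncertainty in the estimated mean, so the prediction interval is wider than the confidence interval.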
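The difference in width comes from one extra term in the standard error. The sketch below recomputes both widths from the usual simple-regression formulas, assuming a 95% level and height = 60; all variable names are illustrative.

```r
# Standard errors behind the two intervals at height = 60
fit   <- lm(weight ~ height, data = women)
x     <- women$height
x0    <- 60
n     <- length(x)
s     <- sigma(fit)                        # residual standard error
sxx   <- sum((x - mean(x))^2)              # spread of the observed heights
tcrit <- qt(0.975, df = df.residual(fit))

se_mean <- s * sqrt(1 / n + (x0 - mean(x))^2 / sxx)       # mean weight at x0
se_pred <- s * sqrt(1 + 1 / n + (x0 - mean(x))^2 / sxx)   # one new observation

c(conf_width = 2 * tcrit * se_mean, pred_width = 2 * tcrit * se_pred)
```

The only difference between `se_mean` and `se_pred` is the leading `1` under the square root, which accounts for the scatter of an individual weight around the regression line.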

Both intervals are centered at the same fitted value:

confy[1] == predy[1]

## [1] TRUE
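The shared center is the fitted value from the regression line at height 60. This is a small sketch, assuming the same model; `b` and `center` are illustrative names.

```r
# The common center of both intervals is intercept + slope * 60
fit    <- lm(weight ~ height, data = women)
b      <- coef(fit)
center <- unname(b["(Intercept)"] + b["height"] * 60)
center   # approximately 119.4833, the `fit` column of both intervals
```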

summary(mymod)

## 
## Call:
## lm(formula = weight ~ height, data = women)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7333 -1.1333 -0.3833  0.7417  3.1167 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -87.51667    5.93694  -14.74 1.71e-09 ***
## height        3.45000    0.09114   37.85 1.09e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.525 on 13 degrees of freedom
## Multiple R-squared:  0.991,  Adjusted R-squared:  0.9903 
## F-statistic:  1433 on 1 and 13 DF,  p-value: 1.091e-14

The summary reports the coefficient estimates with their standard errors and t-tests, the residual standard error, R^2, the F-statistic, and its p-value.
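The same quantities can be pulled out of the summary object instead of being read off the printout. This is a sketch, assuming the same model; `s`, `fstat`, and `p_val` are illustrative names.

```r
# Extract individual pieces of the model summary
fit <- lm(weight ~ height, data = women)
s   <- summary(fit)

s$r.squared                      # Multiple R-squared
s$adj.r.squared                  # Adjusted R-squared
fstat <- s$fstatistic            # F value, numerator df, denominator df
p_val <- pf(fstat["value"], fstat["numdf"], fstat["dendf"], lower.tail = FALSE)
unname(p_val)                    # overall p-value of the F-test

# With one predictor, the F-statistic is the square of the slope's t value
coef(s)["height", "t value"]^2
```

With a single predictor, the F-test and the t-test on the slope are the same test, which is why the two p-values in the printout match.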

John Talbot

2/8/2018
