mussel<-read.csv(url("http://cknudson.com/data/mussels.csv"))
names(mussel)
##  [1] "GroupID"    "dry.mass"   "count"      "attached"   "lipid"     
##  [6] "protein"    "carbo"      "ash"        "Kcal"       "ammonia"   
## [11] "O2"         "AvgAmmonia" "AvgO2"      "AvgMass"
attach(mussel)

I created a linear model below that predicts the average amount of ammonia (AvgAmmonia) from the average mass of the mussel (AvgMass) and what the mussel is attached to (a rock or an Amblema mussel).

modelm <- lm(AvgAmmonia ~ AvgMass + attached, data = mussel)
modelm
## 
## Call:
## lm(formula = AvgAmmonia ~ AvgMass + attached, data = mussel)
## 
## Coefficients:
##  (Intercept)       AvgMass  attachedRock  
##     0.001140      0.239279     -0.002563
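As a rough sketch of what these coefficients mean, the fitted equation can be written out by hand. The function name predict_ammonia and its is_rock argument are hypothetical, made up for illustration only. The coefficients are the rounded values printed above, so results agree with the model only up to rounding. The attachedRock term acts as an indicator that is 1 when the mussel is attached to a rock and 0 otherwise.

```r
# Hypothetical helper: fitted AvgAmmonia from the printed (rounded) coefficients.
predict_ammonia <- function(avg_mass, is_rock) {
  intercept <- 0.001140   # (Intercept)
  b_mass    <- 0.239279   # AvgMass
  b_rock    <- -0.002563  # attachedRock
  # as.numeric() turns TRUE/FALSE into the 1/0 indicator
  intercept + b_mass * avg_mass + b_rock * as.numeric(is_rock)
}

rock_fit  <- predict_ammonia(0.028, is_rock = TRUE)   # about 0.00528
other_fit <- predict_ammonia(0.028, is_rock = FALSE)

# At the same average mass, the two groups differ by exactly the
# attachedRock coefficient.
rock_fit - other_fit
```

Because the model has no interaction term, the two groups share the same slope for AvgMass and differ only by a vertical shift.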

Hypothesis Test

We want to test the significance of the slope for AvgMass, that is, whether there is a linear relationship between average ammonia and the average mass of a mussel after accounting for what the mussel is attached to. We will use the null hypothesis H0: B1 = 0 (the slope for average mass is 0) and the alternative hypothesis Ha: B1 ≠ 0 (the slope is not equal to 0).

summary(modelm)
## 
## Call:
## lm(formula = AvgAmmonia ~ AvgMass + attached, data = mussel)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.019e-03 -5.240e-04 -5.959e-05  3.429e-04  2.526e-03 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.0011398  0.0005533   2.060     0.05 *  
## AvgMass       0.2392793  0.0215863  11.085 3.86e-11 ***
## attachedRock -0.0025629  0.0003931  -6.519 7.91e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.00103 on 25 degrees of freedom
## Multiple R-squared:  0.8574, Adjusted R-squared:  0.846 
## F-statistic: 75.18 on 2 and 25 DF,  p-value: 2.66e-11

We can see that our test statistic is 11.085. The test statistic is found by taking beta1hat/SE(beta1hat), and under the null hypothesis it follows a t distribution with 25 degrees of freedom. The p-value for average mass (3.86e-11) is much smaller than alpha = .05, so we reject the null hypothesis. We can say that, after accounting for what the mussel is attached to, average mass and average ammonia have a significant linear relationship. We can interpret the slope by saying that, holding what the mussel is attached to constant, each additional unit of average mass is associated with a predicted increase of .2393 units in average ammonia.
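The test statistic and p-value can also be reproduced by hand. This is a minimal sketch that copies the rounded estimate, standard error, and degrees of freedom from the summary output above, so the variable names are illustrative and the results match the summary only up to rounding.

```r
# Values copied from the summary(modelm) output (rounded as printed).
beta1_hat <- 0.2392793   # estimated slope for AvgMass
se_beta1  <- 0.0215863   # standard error of that estimate
df_resid  <- 25          # residual degrees of freedom

# Test statistic: the estimate divided by its standard error.
t_stat <- beta1_hat / se_beta1

# Two-sided p-value from the t distribution with 25 degrees of freedom.
p_value <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

t_stat
p_value
```

Both values should agree with the AvgMass row of the coefficient table, apart from small rounding differences.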

Confidence Interval for a Regression Coefficient

Next, let’s make a confidence interval for the regression coefficient of AvgMass.

confint(modelm)
##                      2.5 %       97.5 %
## (Intercept)   1.999745e-07  0.002279427
## AvgMass       1.948215e-01  0.283737200
## attachedRock -3.372584e-03 -0.001753235

We are 95% confident that, holding what the mussel is attached to constant, each additional unit of average mass is associated with an increase in mean average ammonia of between 0.1948 and 0.2837 units.
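The interval for AvgMass is the estimate plus or minus a t critical value times the standard error. Here is a small sketch of that arithmetic, again using the rounded values from the summary output. The variable names are illustrative.

```r
# Values copied from the summary(modelm) output (rounded as printed).
beta1_hat <- 0.2392793   # estimated slope for AvgMass
se_beta1  <- 0.0215863   # standard error of that estimate
df_resid  <- 25          # residual degrees of freedom

# Critical value for a 95% two-sided interval.
t_crit <- qt(0.975, df = df_resid)

# Estimate plus or minus the margin of error.
ci_lower <- beta1_hat - t_crit * se_beta1
ci_upper <- beta1_hat + t_crit * se_beta1

c(lower = ci_lower, upper = ci_upper)
```

The endpoints should match the AvgMass row from confint(modelm) to several decimal places.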

Confidence and Prediction Intervals (Given Values of Predictors)

Next, let’s make a confidence interval and a prediction interval for AvgAmmonia (our response) given specific values of our predictors (AvgMass and what the mussel is attached to). We need to make a new data frame to store the specific values we are giving to our predictor variables. Let’s use .028 for average mass and “Rock” for the attached variable.

data2<-data.frame(AvgMass=.028, attached="Rock")
data2
##   AvgMass attached
## 1   0.028     Rock

Now that we have our new data frame with the specific values of xi (our predictor variables), we can use it to make confidence and prediction intervals.

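# "prediction" spelled out in full; the abbreviated "predict" only works through partial matching
pred1 <- predict(modelm, data2, interval = "prediction")
conf1 <- predict(modelm, data2, interval = "confidence")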
pred1
##           fit         lwr         upr
## 1 0.005276726 0.003078631 0.007474821
conf1
##           fit         lwr         upr
## 1 0.005276726 0.004702235 0.005851217

Our prediction interval is for an individual response (average ammonia) given that average mass is .028 and attached = “Rock”. We can say we are 95% confident that when average mass is .028 and the mussel is attached to a rock, the average ammonia for a single new observation will be between 0.003078631 and 0.007474821 units.

Our confidence interval is for the mean response (mean average ammonia) given that average mass is .028 and attached = “Rock”. We can say we are 95% confident that when average mass is .028 and the mussel is attached to a rock, the mean average ammonia will be between 0.004702235 and 0.005851217 units.

As with simple linear regression, our confidence interval is narrower than our prediction interval. This is because the variance of a single new observation is greater than the variance of an estimated mean: the prediction interval has to account for the scatter of individual observations around the mean in addition to the uncertainty in the estimated mean itself.
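To make the width difference concrete, the prediction interval can be roughly rebuilt from the confidence interval and the residual standard error. This is only a sketch. It backs the standard error of the mean response out of the printed confidence interval and uses the rounded residual standard error (0.00103) from the summary output, so the result matches pred1 only approximately. The variable names are illustrative.

```r
# Values copied from the output above.
fit       <- 0.005276726  # fitted value at AvgMass = .028, attached = "Rock"
conf_lwr  <- 0.004702235  # confidence interval limits from conf1
conf_upr  <- 0.005851217
sigma_hat <- 0.00103      # residual standard error (rounded as printed)
df_resid  <- 25

t_crit <- qt(0.975, df = df_resid)

# Standard error of the estimated mean, recovered from the CI half-width.
se_mean <- (conf_upr - conf_lwr) / (2 * t_crit)

# A new observation adds the residual variance on top of that.
se_pred <- sqrt(sigma_hat^2 + se_mean^2)

pred_lwr <- fit - t_crit * se_pred
pred_upr <- fit + t_crit * se_pred

c(lower = pred_lwr, upper = pred_upr)
```

Since se_pred always includes the extra sigma_hat^2 term, the prediction interval is wider than the confidence interval at the same predictor values.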

Conclusion

The confidence intervals we made today for multiple linear regression are very similar to the ones we made for simple linear regression, with some differences in how we interpret them. In particular, when we interpret the regression coefficient of one variable in multiple linear regression, we need to remember to hold the other variables constant.