Learning Log 6: Sections 4.5 & 4.6

For this learning log, I will be using the mussel data set.

# Load the mussel data; every later call passes the data frame explicitly,
# so attach() is not needed
muss <- read.csv("http://cknudson.com/data/mussels.csv")
names(muss)
##  [1] "GroupID"    "dry.mass"   "count"      "attached"   "lipid"     
##  [6] "protein"    "carbo"      "ash"        "Kcal"       "ammonia"   
## [11] "O2"         "AvgAmmonia" "AvgO2"      "AvgMass"

I will now create a multiple linear regression model, using AvgMass and attached as my predictors, and AvgAmmonia as my response.

mymod <- lm(AvgAmmonia ~ AvgMass + attached, data = muss)
summary(mymod)
## 
## Call:
## lm(formula = AvgAmmonia ~ AvgMass + attached, data = muss)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.019e-03 -5.240e-04 -5.959e-05  3.429e-04  2.526e-03 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.0011398  0.0005533   2.060     0.05 *  
## AvgMass       0.2392793  0.0215863  11.085 3.86e-11 ***
## attachedRock -0.0025629  0.0003931  -6.519 7.91e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.00103 on 25 degrees of freedom
## Multiple R-squared:  0.8574, Adjusted R-squared:  0.846 
## F-statistic: 75.18 on 2 and 25 DF,  p-value: 2.66e-11

Hypothesis Test

Let’s do a hypothesis test for the AvgMass coefficient. Let \(\beta_1\) be the regression coefficient for AvgMass.

\[H_0 : \beta_1 = 0\] \[H_a : \beta_1 \neq 0\]

We can read the test statistic and p-value directly from the AvgMass row of the model summary above: the test statistic is t = 11.085 (on 25 residual degrees of freedom) and the p-value is 3.86e-11.

Since the p-value is far below any conventional significance level (such as 0.05), we have enough evidence to reject the null hypothesis. In other words, after accounting for the other predictor (what the mussels are attached to), there is strong evidence of a linear relationship between the average mass of the mussels and their average ammonia output.
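As a check on where those two numbers come from, the test statistic and p-value can be reproduced by hand from the estimate and standard error in the summary. This is only a sketch: the values are copied (rounded) from the output above, and the names est, se, and df_resid are my own labels rather than anything stored in the model object.

```r
# Values copied (rounded) from the AvgMass row of summary(mymod)
est <- 0.2392793      # estimated coefficient for AvgMass
se <- 0.0215863       # standard error of that estimate
df_resid <- 25        # residual degrees of freedom

# t statistic for H0: beta_1 = 0
t_stat <- est / se

# Two-sided p-value from the t distribution with 25 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = df_resid)

round(t_stat, 3)
signif(p_value, 3)
```

Both results should agree with the summary table up to rounding.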

Time for a confidence interval!

confint(mymod)
##                      2.5 %       97.5 %
## (Intercept)   1.999745e-07  0.002279427
## AvgMass       1.948215e-01  0.283737200
## attachedRock -3.372584e-03 -0.001753235

I am 95% confident that the true regression coefficient for AvgMass, \(\beta_1\), lies between 0.1948 and 0.2837. In other words, I am 95% confident that, for zebra mussels attached to the same type of object, each one-gram increase in average mass is associated with an increase in mean ammonia output of between 0.1948 and 0.2837 mg/h.
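The interval that confint() reports for AvgMass is the usual estimate plus or minus a t critical value times the standard error. Here is a rough sketch of that arithmetic, assuming the rounded values from the model summary; the variable names are my own and the last digit may differ slightly because of rounding.

```r
# Values copied (rounded) from the AvgMass row of summary(mymod)
est <- 0.2392793      # estimated coefficient for AvgMass
se <- 0.0215863       # standard error of that estimate
df_resid <- 25        # residual degrees of freedom

# Critical value for a 95% two-sided interval
t_crit <- qt(0.975, df = df_resid)

# estimate +/- critical value * standard error
lower <- est - t_crit * se
upper <- est + t_crit * se

c(lower = lower, upper = upper)
```

The two endpoints should match the AvgMass row of the confint() output to about five decimal places.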

CI for the mean of y and PI for an individual y, given values of x

I’ll now calculate a confidence interval for the mean ammonia output given specific values of AvgMass and attached. Let’s calculate a CI with AvgMass = 0.025 for mussels that are attached to a rock.

newmuss <- data.frame(attached = "Rock", AvgMass = 0.025)
(confAmmonia <- predict(mymod, newmuss, interval = "confidence"))
##           fit         lwr         upr
## 1 0.004558888 0.004009637 0.005108139

I am 95% confident that the mean ammonia output (AvgAmmonia) for zebra mussels with an average mass of 0.025 that are attached to a rock is between 0.004010 and 0.005108 mg/h.
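The center of this interval (the fit column) is just the fitted regression equation evaluated at the chosen values. A small sketch of that calculation, assuming the rounded coefficients from the model summary and assuming that the attachedRock indicator equals 1 for mussels on a rock, as the coefficient name suggests; the names b0, b1, b2, and rock are my own.

```r
# Coefficients copied (rounded) from summary(mymod)
b0 <- 0.0011398       # intercept
b1 <- 0.2392793       # slope for AvgMass
b2 <- -0.0025629      # shift for attached = "Rock"

avg_mass <- 0.025     # chosen value of AvgMass
rock <- 1             # indicator: 1 if attached to a rock, 0 otherwise

# Fitted mean ammonia output at these predictor values
fit <- b0 + b1 * avg_mass + b2 * rock
fit
```

Up to rounding in the coefficients, this should reproduce the fit value returned by predict().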

Let’s now calculate a prediction interval for an individual value of y, using the same values of AvgMass and attached.

(predAmmonia <- predict(mymod, newmuss, interval = "prediction"))
##           fit         lwr         upr
## 1 0.004558888 0.002367254 0.006750522

Thus I am 95% confident that the AvgAmmonia output for an individual zebra mussel with a mass of 0.025 that is attached to a rock is between 0.002367 and 0.006751 mg/h.
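The prediction interval is wider than the confidence interval because it adds the residual variation of an individual observation to the uncertainty in the estimated mean. The sketch below tries to recover the PI half-width from the CI half-width and the residual standard error. It uses only rounded numbers copied from the output above, and the variable names are my own, so the match is approximate.

```r
# Values copied (rounded) from the outputs above
fit <- 0.004558888        # fitted value from predict()
ci_upper <- 0.005108139   # upper limit of the confidence interval
s <- 0.00103              # residual standard error from summary(mymod)
df_resid <- 25            # residual degrees of freedom

t_crit <- qt(0.975, df = df_resid)

# Half-width of the CI equals t_crit * (standard error of the fitted mean)
half_ci <- ci_upper - fit
se_fit <- half_ci / t_crit

# A prediction adds the residual variance to the variance of the fitted mean
half_pi <- t_crit * sqrt(s^2 + se_fit^2)

c(lower = fit - half_pi, upper = fit + half_pi)
```

The resulting limits should land close to the interval from predict() with interval = "prediction"; small differences come from s being rounded to three significant figures.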