Include the test and CI interval for regular coefficient of the mussels data. Include the full process (hypothesis, test stat, pval, conclusion in context of problem) for HI. Include CI interpretation.
Start by loading the data. And plotting it
mdata<-read.csv(url("http://cknudson.com/data/mussels.csv"))
mdata
## GroupID dry.mass count attached lipid protein carbo ash Kcal ammonia
## 1 1 0.55 20 Rock 8.14 47.43 21.59 5.51 3.61 0.07
## 2 2 0.45 19 Rock 9.34 53.89 23.41 6.34 4.06 0.07
## 3 3 0.37 20 Rock 9.12 49.01 21.10 5.63 3.74 0.07
## 4 4 0.63 20 Rock 10.32 49.25 16.55 5.41 3.66 0.11
## 5 5 0.57 20 Rock 10.08 50.17 17.51 6.10 3.72 0.11
## 6 6 0.57 22 Rock 10.83 53.84 19.97 6.36 4.04 0.11
## 7 7 0.56 20 Rock 9.37 47.75 28.95 5.20 4.03 0.11
## 8 8 0.54 20 Rock 9.44 62.99 22.08 5.58 4.41 0.10
## 9 9 0.55 21 Rock 9.84 53.89 24.61 5.77 4.15 0.10
## 10 10 0.55 20 Rock 10.18 55.34 29.89 6.09 4.46 0.10
## 11 11 0.43 20 Rock 10.75 49.92 25.06 NA 4.08 0.08
## 12 12 0.51 19 Rock 9.98 54.16 20.52 5.55 4.01 0.09
## 13 13 0.34 20 Rock 10.44 58.43 21.85 5.44 4.29 0.07
## 14 14 0.34 20 Rock 10.24 53.94 20.09 6.37 4.00 0.07
## 15 15 0.30 20 Rock 9.00 71.94 18.52 4.62 4.61 0.05
## 16 16 0.83 34 Amblema 10.33 62.63 14.51 6.03 4.16 0.21
## 17 17 0.49 31 Amblema 7.91 59.62 17.40 5.42 3.94 0.09
## 18 18 0.46 72 Amblema 8.28 56.67 15.76 5.17 3.78 0.18
## 19 19 0.44 23 Amblema 6.82 53.20 9.97 5.39 3.27 0.14
## 20 20 1.16 27 Amblema 9.07 62.99 15.94 6.45 4.12 0.35
## 21 22 1.10 23 Amblema 6.50 59.36 22.95 6.57 4.04 0.25
## 22 23 1.07 42 Amblema 8.13 56.64 18.39 5.16 3.87 0.41
## 23 24 0.36 38 Amblema 8.27 43.49 21.10 4.91 3.43 0.11
## 24 25 0.47 38 Amblema 10.30 54.12 16.81 5.33 3.88 0.13
## 25 27 0.43 26 Amblema 9.53 52.88 15.60 5.55 3.71 0.12
## 26 28 0.89 28 Amblema 8.19 60.14 NA 5.53 NA 0.25
## 27 29 0.70 31 Amblema 9.44 48.96 20.83 5.76 3.75 0.26
## 28 30 0.68 64 Amblema 8.04 NA 17.89 NA NA 0.23
## O2 AvgAmmonia AvgO2 AvgMass
## 1 0.82 0.003500000 0.04100 0.027500
## 2 0.70 0.003684210 0.03684 0.023684
## 3 0.62 0.003500000 0.03100 0.018500
## 4 0.89 0.005500000 0.04450 0.031500
## 5 1.09 0.005500000 0.05450 0.028500
## 6 1.00 0.005000000 0.04545 0.025909
## 7 0.92 0.005500000 0.04600 0.028000
## 8 0.92 0.005000000 0.04600 0.027000
## 9 0.93 0.004761905 0.04429 0.026190
## 10 0.92 0.005000000 0.04600 0.027500
## 11 0.77 0.004000000 0.03850 0.021500
## 12 0.80 0.004736842 0.04211 0.026842
## 13 0.57 0.003500000 0.02850 0.017000
## 14 0.62 0.003500000 0.03100 0.017000
## 15 0.38 0.002500000 0.01900 0.015000
## 16 1.93 0.006176471 0.05676 0.024412
## 17 0.84 0.002903226 0.02710 0.015806
## 18 1.05 0.002500000 0.01458 0.006389
## 19 1.05 0.006086956 0.04565 0.019130
## 20 2.39 0.012962963 0.08852 0.042963
## 21 2.07 0.010869565 0.09000 0.047826
## 22 2.77 0.009761905 0.06595 0.025476
## 23 0.79 0.002894737 0.02079 0.009474
## 24 0.97 0.003421053 0.02553 0.012368
## 25 1.24 0.004615385 0.04769 0.016538
## 26 2.68 0.008928571 0.09571 0.031786
## 27 2.26 0.008387097 0.07290 0.022581
## 28 2.05 0.003593750 0.03203 0.010625
attach(mdata)
muss <- lm( AvgAmmonia ~ AvgMass + attached , mdata)
muss
##
## Call:
## lm(formula = AvgAmmonia ~ AvgMass + attached, data = mdata)
##
## Coefficients:
## (Intercept) AvgMass attachedRock
## 0.001140 0.239279 -0.002563
This creates a linear regression where the Average Ammonia depends on the Average Mass of the clump of mussels and if the mussels are attached to a rock or Amblema. Thus, this is a multiple regression model because the average ammonia depends on more than one factor – the mass and what it’s attached to .
So the formula is:
AverageAmmonia = .001140 + (.239279)*(AvgMass) + -.002563(rock)
summary(muss)
##
## Call:
## lm(formula = AvgAmmonia ~ AvgMass + attached, data = mdata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.019e-03 -5.240e-04 -5.959e-05 3.429e-04 2.526e-03
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.0011398 0.0005533 2.060 0.05 *
## AvgMass 0.2392793 0.0215863 11.085 3.86e-11 ***
## attachedRock -0.0025629 0.0003931 -6.519 7.91e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.00103 on 25 degrees of freedom
## Multiple R-squared: 0.8574, Adjusted R-squared: 0.846
## F-statistic: 75.18 on 2 and 25 DF, p-value: 2.66e-11
Let’s take a look at the given data. We see that the average mass slope is .23928 and the attached rock slope is -.00257. We see that the test stat is 11.085 for the Avg Mass and -6.519 for the attackedRock. We see that the p-value is 2.66*10^-11. If we let H0 : B1 = 0 HA: B1 ~= 0 Then we see that p-value < alpha = .05 and thus we can reject the null hypothesis and see that B1 does not equal zero.
newData <- data.frame(AvgMass=.0275, attached="Rock")
confy <- predict(muss, newData, interval = "confidence")
confy
## fit lwr upr
## 1 0.005157086 0.004588894 0.005725278
Here is the confidence interval for the average AverageWeight of the Ammonia given that the Average Mass is .0275 and that it is attached to a rock.
newData <- data.frame(AvgMass=.0275, attached="Rock")
predy <- predict(muss, newData, interval = "predict")
predy
## fit lwr upr
## 1 0.005157086 0.002960629 0.007353544
This is the prediction interval for when the AvgMass is .0275.
c <- confy %*% c(0, -1, 1) #conf interval width
p <- predy %*% c(0, -1, 1) #pred interval width
Let’s compare the width of the confidence interval to the width of the prediction interval:
c - p
## [,1]
## 1 -0.003256531
So the confidence interval is more narrow which is expected
confint(muss)
## 2.5 % 97.5 %
## (Intercept) 1.999745e-07 0.002279427
## AvgMass 1.948215e-01 0.283737200
## attachedRock -3.372584e-03 -0.001753235
Here is the confidence intervals for the data. This means that 95% of the data will fall between these ranges