Getting Data
Here, x = minutes of exposure
y = number of bacteria
x <- c(1:12)
y <- c(175,108,95,82,71,50,49,31,28,17,16,11)
A) Fit a simple linear regression model to the data. What is the value of R^2?
model <- lm(y~x)
summary(model)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.323 -9.890 -7.323 2.463 45.282
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 142.20 11.26 12.627 1.81e-07 ***
## x -12.48 1.53 -8.155 9.94e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.3 on 10 degrees of freedom
## Multiple R-squared: 0.8693, Adjusted R-squared: 0.8562
## F-statistic: 66.51 on 1 and 10 DF, p-value: 9.944e-06
The summary of the fitted regression model above shows that R^2 = 0.8693, so the straight-line fit explains about 87% of the variation in the number of bacteria.
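As a check on where this figure comes from, R^2 can be recomputed from its definition, 1 - SSE/SST. The following is a minimal sketch using the data given earlier; the names sse, sst and r2 are introduced here only for illustration.

```r
x <- 1:12
y <- c(175, 108, 95, 82, 71, 50, 49, 31, 28, 17, 16, 11)
model <- lm(y ~ x)

sse <- sum(residuals(model)^2)   # residual (error) sum of squares
sst <- sum((y - mean(y))^2)      # total sum of squares about the mean
r2  <- 1 - sse / sst             # should agree with summary(model)$r.squared
round(r2, 4)
```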
In the residuals vs fitted plot above, the points do not look randomly scattered, so the constant-variance assumption does not appear to hold for this model. With only 12 data points, however, this reading is tentative.
In the normal Q-Q plot, the points do not fall close to a straight line, so the normality assumption also appears to be violated. The same caveat applies: 12 data points are too few to judge this with confidence.
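The two diagnostic plots discussed here are not reproduced in the text. One way they could be drawn is sketched below, assuming base R's plot method for lm objects was used; the original plotting code is not shown, so this is an illustration rather than the author's exact call.

```r
x <- 1:12
y <- c(175, 108, 95, 82, 71, 50, 49, 31, 28, 17, 16, 11)
model <- lm(y ~ x)

par(mfrow = c(1, 2))     # two panels side by side
plot(model, which = 1)   # residuals vs fitted values
plot(model, which = 2)   # normal Q-Q plot of standardized residuals
par(mfrow = c(1, 1))     # restore the default layout
```

The same two calls work for any lm object, so they can be reused for a transformed model.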
The estimated power-transformation parameter is very close to 0, so we take it to be 0, which corresponds to a log transformation of the response.
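The code that produced the power-transform estimate is not shown. A minimal sketch of one common approach is a Box-Cox profile likelihood via MASS::boxcox; this choice of function, the lambda grid and the name lambda_hat are assumptions, and the original may have used a different routine.

```r
library(MASS)   # provides boxcox(); assumed here, not confirmed by the original

x <- 1:12
y <- c(175, 108, 95, 82, 71, 50, 49, 31, 28, 17, 16, 11)

# Profile log-likelihood over a grid of candidate powers (grid is an arbitrary choice)
bc <- boxcox(y ~ x, lambda = seq(-2, 2, by = 0.05), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]   # power with the highest log-likelihood
lambda_hat
```

A value of lambda_hat near 0 supports using log(y), since the Box-Cox family reduces to the natural log at a power of 0.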
The values above are the transformed response, log(y).
The summary above shows that R^2 for the transformed model is 0.9822.
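The transformed fit itself is not shown in the text. Since a power of 0 corresponds to the natural log, the model called model2 in the later code can plausibly be reproduced as below. This is a reconstruction, checked only against the R^2 and the prediction reported in this document.

```r
x <- 1:12
y <- c(175, 108, 95, 82, 71, 50, 49, 31, 28, 17, 16, 11)

model2 <- lm(log(y) ~ x)          # straight line fitted to the log of the counts
r2_log <- summary(model2)$r.squared
round(r2_log, 4)
coef(model2)                      # intercept and slope on the log scale
```

Because the response is on the log scale, exp() of the slope gives the multiplicative change in the fitted count per additional minute of exposure.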
The residuals vs fitted plot for the transformed model shows no pattern, so the constant-variance assumption appears reasonable. With only 12 data points, however, this reading is tentative.
The normal Q-Q plot also shows the points falling fairly close to a straight line, so the normality assumption appears to hold for the transformed data, subject to the same small-sample caveat.
Taken together, the higher R^2 (0.9822 vs 0.8693) and the cleaner residual diagnostics indicate that the transformed model fits the data better than the original model.
F) Estimate the number of bacteria at 10 minutes of exposure, how does this compare with the observed value?
p <- c(10)
# model2 is fitted on the log scale, so the fitted value is back-transformed with exp()
pred <- predict(model2, data.frame(x = p), interval = "prediction")
ans <- exp(pred[1])
ans
## [1] 19.62999
Hence the estimated number of bacteria at 10 minutes of exposure is about 19.63. The observed value at x = 10 is 17, so the model overestimates the count by about 2.6.
G) Provide a 95% prediction interval on the number of bacteria at 10 minutes of exposure.
predin <- predict(model2,data.frame(x=p),interval = "prediction")
lower <- exp(predin[2])
upper <- exp(predin[3])
lower
## [1] 14.68805
upper
## [1] 26.2347
Hence the 95% prediction interval for the number of bacteria at 10 minutes of exposure is:
Lower = 14.6880484
Upper = 26.2347038
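The point estimate and both interval limits come from the same predict() call, so they can be back-transformed in one step. A minimal sketch, assuming model2 is the log-linear fit lm(log(y) ~ x) (an assumption consistent with the values reported in this document); the names log_pred, est and observed are illustrative:

```r
x <- 1:12
y <- c(175, 108, 95, 82, 71, 50, 49, 31, 28, 17, 16, 11)
model2 <- lm(log(y) ~ x)   # assumed definition of model2

# predict() returns a 1 x 3 matrix with columns fit, lwr, upr on the log scale
log_pred <- predict(model2, data.frame(x = 10),
                    interval = "prediction", level = 0.95)
est <- exp(log_pred)       # back-transform all three values to the count scale
est

observed <- y[x == 10]     # observed count at 10 minutes
observed >= est[, "lwr"] && observed <= est[, "upr"]   # TRUE if inside the interval
```

Applying exp() to the interval limits is valid because exp() is monotone, so the limits keep their order and coverage on the original scale.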