This question asks you to create a prediction interval from transformed data. Because the model is fit to log(y), we have to exponentiate the result, \(e^{\log(y)} = y\), to get back to our original scale.
Read in the data:
x = c(5,10,15,20,25,30,45,60)
y = c(16.3,9.8,8.2,4.3,3.5,2.8,1.8,1.2)
Take a look at the data - it appears to be non-linear
plot(y~x, main = "Question 1 Data")
Because of the look of the graph (and the hint in part c), we know we want to fit a model where x and y have the following relationship.
Experience will improve your intuition as to why this is the correct type of transformation.
\(y =\alpha x^{\beta}\)
We do this by taking the natural log of both x and y.
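To see why logging both variables works, take the natural log of both sides of the power model. This is a short derivation using the same \(\alpha\) and \(\beta\) as above; it assumes x, y, and \(\alpha\) are positive and leaves out the error term for simplicity.

```latex
\begin{aligned}
y &= \alpha x^{\beta} \\
\ln y &= \ln \alpha + \beta \ln x
\end{aligned}
```

So a regression of log(y) on log(x) has intercept \(\ln \alpha\) and slope \(\beta\), which is why the transformed plot should look roughly linear.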
plot(log(y)~log(x), main = "Question 1 Transformed")
Let's create our model this way.
model = lm(log(y)~log(x))
summary(model)
##
## Call:
## lm(formula = log(y) ~ log(x))
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.19260 -0.04759 -0.01764  0.01953  0.30873
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   4.7247     0.2295   20.58 8.55e-07 ***
## log(x)       -1.0817     0.0738  -14.66 6.33e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1576 on 6 degrees of freedom
## Multiple R-squared: 0.9728, Adjusted R-squared: 0.9683
## F-statistic: 214.8 on 1 and 6 DF, p-value: 6.334e-06
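The coefficients in this summary are on the log scale. As a minimal sketch, assuming the power model \(y = \alpha x^{\beta}\) from above, they can be mapped back to the model parameters as follows. The names alpha_hat and beta_hat are illustrative and not part of the original solution.

```r
# Data from Question 1 (repeated so the block runs by itself)
x = c(5,10,15,20,25,30,45,60)
y = c(16.3,9.8,8.2,4.3,3.5,2.8,1.8,1.2)

# Fit the log-log model: log(y) = log(alpha) + beta * log(x)
model = lm(log(y) ~ log(x))

# The intercept estimates log(alpha), so exponentiate it to get alpha
alpha_hat = exp(coef(model)[["(Intercept)"]])

# The slope estimates beta directly; no back-transformation is needed
beta_hat = coef(model)[["log(x)"]]

c(alpha = alpha_hat, beta = beta_hat)
```

Only the intercept needs exp(); the slope is already the exponent of x in the power model.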
This looks good and all, but what about the prediction?
new = data.frame(x = 15)
predict.lm(model, new, interval = c("prediction"))
##        fit      lwr      upr
## 1 1.795406 1.382595 2.208217
This answer isn't quite correct yet: the fit and both bounds are on the log scale. Since our values are transformed, we need to untransform them, so we exponentiate the whole result:
exp(predict.lm(model, new, interval = c("prediction")))
##       fit      lwr      upr
## 1 6.02192 3.985232 9.099475
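As a sanity check, the back-transformed point prediction should agree with plugging x = 15 directly into the fitted power curve. A small sketch, assuming the same data and model as above (pi_log, pi_orig, alpha_hat, beta_hat, and by_curve are illustrative names):

```r
# Data and model from Question 1 (repeated so the block runs by itself)
x = c(5,10,15,20,25,30,45,60)
y = c(16.3,9.8,8.2,4.3,3.5,2.8,1.8,1.2)
model = lm(log(y) ~ log(x))
new = data.frame(x = 15)

# Prediction interval on the log scale, then back-transformed with exp()
pi_log = predict(model, new, interval = "prediction")
pi_orig = exp(pi_log)

# Same point estimate computed from the power curve alpha * x^beta
alpha_hat = exp(coef(model)[["(Intercept)"]])
beta_hat = coef(model)[["log(x)"]]
by_curve = alpha_hat * 15^beta_hat

# TRUE if the two routes give the same fitted value
all.equal(unname(pi_orig[1, "fit"]), by_curve)
```

Because exp() is an increasing function, the lower and upper bounds keep their order after back-transformation, but the interval is no longer symmetric about the fit.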
Input data:
cyc = c(1326,1593,4414,5673,29516,26,843,1016,3410,7101,7356,7904,79,4175,34676,114789,2672,7532,30220)
str = c(0.01465,0.01560,0.01110,0.01160,0.00943,0.01899,0.00840,0.00701,0.00620,0.00535,0.00516,0.00550,0.01302,0.00702,0.00496,0.00660,0.00830,0.00943,0.00736)
Let's look at the plot:
plot(str~cyc, main = "Question 2 Data")
Experience will improve your intuition as to why this is the correct type of transformation. Below the data in WebAssign, the question asks you to consider using this type of model:
\(y = \alpha + \beta \ln{x}\)
This means we transform only our x (cyc) and leave y (str) as it is.
Let's look at the plot to improve our intuition and verify what the book is telling us:
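One way to see this is to substitute \(u = \ln x\) into the model. The variable \(u\) is introduced here only for illustration.

```latex
\begin{aligned}
y &= \alpha + \beta \ln x \\
  &= \alpha + \beta u, \qquad u = \ln x
\end{aligned}
```

The model is an ordinary straight line in \(u\), and y itself is never transformed, so predictions from this fit are already on the original scale of str and do not need exp().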
plot(str~log(cyc), main = "Question 2 Transformed")
Ok, now conduct the regression:
model2 = lm(str~log(cyc))
summary(model2)
##
## Call:
## lm(formula = str ~ log(cyc))
##
## Residuals:
##       Min        1Q    Median        3Q       Max
## -0.003947 -0.002749 -0.001233  0.002544  0.005250
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.0202997  0.0029398   6.905 2.55e-06 ***
## log(cyc)    -0.0013494  0.0003491  -3.866  0.00124 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.002979 on 17 degrees of freedom
## Multiple R-squared: 0.4678, Adjusted R-squared: 0.4365
## F-statistic: 14.94 on 1 and 17 DF, p-value: 0.00124
This output answers the y =, t =, P-value, and \(R^2\) parts of the question.
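If you would rather pull these numbers out of the fit than read them off the printout, a sketch along these lines should work. The names s2, coefs, slope_t, slope_p, and r2 are illustrative; the data are repeated so the block runs by itself.

```r
# Data from Question 2
cyc = c(1326,1593,4414,5673,29516,26,843,1016,3410,7101,7356,7904,79,4175,34676,114789,2672,7532,30220)
str = c(0.01465,0.01560,0.01110,0.01160,0.00943,0.01899,0.00840,0.00701,0.00620,0.00535,0.00516,0.00550,0.01302,0.00702,0.00496,0.00660,0.00830,0.00943,0.00736)

model2 = lm(str ~ log(cyc))
s2 = summary(model2)

# Coefficient table: estimate, standard error, t value, and p-value per term
coefs = coef(s2)
slope_t = coefs["log(cyc)", "t value"]
slope_p = coefs["log(cyc)", "Pr(>|t|)"]

# Proportion of the variation in str explained by log(cyc)
r2 = s2$r.squared

c(t = slope_t, p = slope_p, R2 = r2)
```

The fitted equation itself comes from the Estimate column of the same table, that is, coefs[, "Estimate"].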
Lastly, the prediction interval.
We could use the formula I taught in the previous block, or we can use our R code.
q15prediction = data.frame(cyc = 3000)
predict.lm(model2, newdata = q15prediction, interval = "prediction")
##           fit         lwr        upr
## 1 0.009495973 0.003046094 0.01594585
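The formula from the previous block is not reproduced here. Assuming it is the standard simple-linear-regression prediction interval, \(\hat{y}_0 \pm t_{0.025,\,n-2}\; s\sqrt{1 + 1/n + (x_0 - \bar{x})^2 / S_{xx}}\), applied with x = log(cyc), a sketch of the hand calculation looks like this. All variable names below are illustrative.

```r
# Data from Question 2 (repeated so the block runs by itself)
cyc = c(1326,1593,4414,5673,29516,26,843,1016,3410,7101,7356,7904,79,4175,34676,114789,2672,7532,30220)
str = c(0.01465,0.01560,0.01110,0.01160,0.00943,0.01899,0.00840,0.00701,0.00620,0.00535,0.00516,0.00550,0.01302,0.00702,0.00496,0.00660,0.00830,0.00943,0.00736)

# Work with the transformed predictor directly
x = log(cyc)
n = length(x)
fit = lm(str ~ x)

# Pieces of the formula
s = summary(fit)$sigma                  # residual standard error
x0 = log(3000)                          # new point, on the log scale
y0 = sum(coef(fit) * c(1, x0))          # point prediction
sxx = sum((x - mean(x))^2)              # spread of the predictor
se_pred = s * sqrt(1 + 1/n + (x0 - mean(x))^2 / sxx)
t_crit = qt(0.975, df = n - 2)          # 95% two-sided critical value

manual = c(fit = y0, lwr = y0 - t_crit * se_pred, upr = y0 + t_crit * se_pred)

# Built-in result for comparison
builtin = predict(fit, newdata = data.frame(x = x0), interval = "prediction")

manual
builtin
```

Both routes should give the same three numbers. No exp() is applied, because str was never transformed in this model.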