Least Squares Linear Regression

Exercise 1. Reconsider the relationship between city air particulate levels and rates of childhood asthma first discussed in Homework 10. We sample 15 cities, recording large particulate matter in parts per million (ppm) and the rate of childhood asthma as a percentage.

Variable Size Mean Variance
Particulate (x) 15 11.42 13.05029
Asthma (y) 15 14.51333 2.635524

  a. Suppose we sample a new city whose particulate level is 13 ppm. If reasonable, create a 95% prediction interval for the rate of childhood asthma in this city. If not reasonable, explain why.
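# Data for the 15 sampled cities: particulate (ppm) and asthma rate (%)
particulate <- c(11.6, 15.9, 15.7, 7.9, 6.3, 13.7, 13.1, 10.8,
                 6.0, 7.6, 14.8, 7.4, 16.2, 13.1, 11.2)
asthma <- c(14.5, 16.6, 16.5, 12.6, 12.0, 15.8, 15.1, 14.2,
            12.2, 13.1, 16.0, 12.9, 16.4, 15.4, 14.4)

# Fit the least-squares line of asthma rate on particulate level
asthma_model <- lm(asthma ~ particulate)
summary(asthma_model)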
## 
## Call:
## lm(formula = asthma ~ particulate)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.34226 -0.12842 -0.01514  0.12130  0.29164 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  9.41626    0.17344   54.29  < 2e-16 ***
## particulate  0.44633    0.01452   30.73  1.6e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1963 on 13 degrees of freedom
## Multiple R-squared:  0.9864, Adjusted R-squared:  0.9854 
## F-statistic: 944.4 on 1 and 13 DF,  p-value: 1.595e-13
# t critical value for 95% intervals (n - 2 = 13 degrees of freedom)
qt(0.975, df = 13)
## [1] 2.160369
# New city with 13 ppm: 95% prediction interval for its asthma rate
new_city_13 <- data.frame(particulate = 13)
predict(asthma_model, newdata = new_city_13, interval = "prediction")
##        fit      lwr      upr
## 1 15.21853 14.77771 15.65936
# All cities with 10 ppm: 95% confidence interval for the mean asthma rate
new_city_10 <- data.frame(particulate = 10)
predict(asthma_model, newdata = new_city_10, interval = "confidence")
##        fit      lwr      upr
## 1 13.87955 13.76132 13.99777
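As a cross-check, the two `predict()` intervals above can be rebuilt from the summary table and the regression output using the usual simple-linear-regression standard errors. This is only a sketch and not part of the original solution: the helper name `interval_at` is made up for illustration, and the rounded inputs (9.41626, 0.44633, 0.1963) are copied from the output above, so the results should agree with `predict()` only to about four decimal places.

```r
# Values copied (rounded) from the summary table and summary(asthma_model)
n      <- 15
x_bar  <- 11.42
s_xx   <- (n - 1) * 13.05029     # sum of squared deviations of x
b0     <- 9.41626                # estimated intercept
b1     <- 0.44633                # estimated slope
s      <- 0.1963                 # residual standard error
t_crit <- qt(0.975, df = n - 2)

# Hypothetical helper: 95% interval at particulate level x0.
# A prediction interval adds 1 under the square root to account for
# the new city's own scatter around the line.
interval_at <- function(x0, type = c("confidence", "prediction")) {
  type  <- match.arg(type)
  extra <- if (type == "prediction") 1 else 0
  fit   <- b0 + b1 * x0
  se    <- s * sqrt(extra + 1 / n + (x0 - x_bar)^2 / s_xx)
  c(fit = fit, lwr = fit - t_crit * se, upr = fit + t_crit * se)
}

pi_13 <- interval_at(13, "prediction")
ci_10 <- interval_at(10, "confidence")
pi_10 <- interval_at(10, "prediction")
round(rbind(pi_13, ci_10, pi_10), 4)
```

Because the only difference between the two interval types is the extra 1 under the square root, the prediction interval at any given particulate level is always wider than the confidence interval at that same level.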
  b. Create a 95% confidence interval for the average rate of childhood asthma among all cities with 10 ppm of large particulate. Is this confidence interval wider or narrower than a 95% prediction interval for the rate of childhood asthma in the next city with 10 ppm of large particulate? Explain how you know this without having to build the prediction interval.
  c. If reasonable, create a 95% prediction interval for the rate of childhood asthma in the next city sampled that has 3 ppm of large particulate. If this is not reasonable, explain why.
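Whether a prediction at a given particulate level is reasonable depends on whether that level lies inside the range of the sampled data. A minimal check follows, re-entering the data so the snippet runs on its own; the names `x_range` and `within_range` are illustrative and do not appear in the original solution.

```r
particulate <- c(11.6, 15.9, 15.7, 7.9, 6.3, 13.7, 13.1, 10.8,
                 6.0, 7.6, 14.8, 7.4, 16.2, 13.1, 11.2)

# Smallest and largest particulate levels actually observed
x_range <- range(particulate)

# TRUE when a candidate level falls inside the observed range
within_range <- function(x0) x0 >= x_range[1] & x0 <= x_range[2]

x_range
within_range(c(13, 10, 3))
```

The sampled levels run from 6.0 to 16.2 ppm, so 13 ppm and 10 ppm are covered while 3 ppm is not. An interval at 3 ppm would be an extrapolation: the data give no evidence that the linear relationship continues that far below the observed range.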

Exercise 2. In the paper “Artificial Trees as a Cavity Substrate for Woodpeckers”, scientists provided polystyrene cylinders as an alternative roost. The paper related values of x = ambient temperature (°C) and y = cavity depth (cm). A scatterplot in the paper showed a strong linear relationship between x and y. The summary values for x and y are given below:

Variable Size Mean Variance
Temp (x) 12 10.92 137.17
Depth (y) 12 16.36 21.28

A least-squares linear model for (Depth ~ Temp) was fit, and the intercept was estimated to be 20.12506 with standard error 0.94023. The slope was estimated to be -0.34504 with standard error 0.06008. The MSE for the model is \(2.334^2\).

  a. Determine the sample correlation (r) from the summaries given.
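For a least-squares line, the slope and the correlation are linked by \(b_1 = r \, s_y / s_x\), so r can be recovered from the two variances in the table and the slope estimate quoted in the exercise. A sketch of that arithmetic, with illustrative variable names:

```r
var_x <- 137.17      # variance of temperature, from the table
var_y <- 21.28       # variance of cavity depth, from the table
b1    <- -0.34504    # estimated slope, from the exercise statement

# Rearranging b1 = r * s_y / s_x gives r = b1 * s_x / s_y;
# r takes the same sign as the slope.
r <- b1 * sqrt(var_x) / sqrt(var_y)
r
r^2                  # proportion of variation in depth explained by temperature
```

The result is close to -0.88, a strong negative linear association, which agrees with the scatterplot described in the exercise.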
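# Two-sided p-value for H_A: beta_1 != 0
2*pt(-5.743, df = 10)
## [1] 0.0001869496
# Lower-tail p-value for H_A: beta_1 < 0
pt(-5.753, df = 10)
## [1] 9.22048e-05
# Upper-tail p-value for H_A: beta_1 > 0
pt(-5.753, df = 10, lower.tail = FALSE)
## [1] 0.9999078
# Two-sided p-value for H_A: beta_1 != -0.5
2*pt(2.579, df = 10, lower.tail = FALSE)
## [1] 0.02746338
# t critical value for a 98% confidence interval (df = 10)
qt(0.99, df = 10)
## [1] 2.763769
# t critical value for a 95% prediction interval (df = 10)
qt(0.975, df = 10)
## [1] 2.228139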
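Each `pt()` call above takes a t statistic of the form (estimate minus hypothesized value) divided by the standard error, with n - 2 = 10 degrees of freedom. The sketch below computes those statistics directly from the slope estimate and standard error quoted in the exercise instead of typing in rounded values; the variable names are illustrative.

```r
b1    <- -0.34504   # estimated slope
se_b1 <- 0.06008    # standard error of the slope
df    <- 12 - 2     # n - 2 for simple linear regression

# Tests with hypothesized slope 0
t0          <- (b1 - 0) / se_b1
p_two_sided <- 2 * pt(-abs(t0), df)             # H_A: beta_1 != 0
p_lower     <- pt(t0, df)                       # H_A: beta_1 < 0
p_upper     <- pt(t0, df, lower.tail = FALSE)   # H_A: beta_1 > 0

# Test with hypothesized slope -0.5
t_half <- (b1 - (-0.5)) / se_b1
p_half <- 2 * pt(-abs(t_half), df)              # H_A: beta_1 != -0.5

round(c(t0 = t0, t_half = t_half), 3)
c(p_two_sided = p_two_sided, p_lower = p_lower,
  p_upper = p_upper, p_half = p_half)
```

The two one-sided p-values always sum to 1, and the smaller of them is half the two-sided p-value, which gives a quick consistency check on the four results.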
  b. Give the linear regression model with least squares estimates for \(\beta_0\) and \(\beta_1\) relating ambient temperature (x) and cavity depth (y).

Determine the test statistics and p-values for the tests in parts c-f:

  c. \(H_0: \beta_1=0\) vs \(H_A: \beta_1 \ne 0\)
  d. \(H_0: \beta_1 \ge 0\) vs \(H_A: \beta_1 < 0\)
  e. \(H_0: \beta_1 \le 0\) vs \(H_A: \beta_1 > 0\)
  f. \(H_0: \beta_1=-0.5\) vs \(H_A: \beta_1 \ne -0.5\)
  g. Compute and interpret a 98% confidence interval for the slope of the regression line, \(\beta_1\).
  h. Construct a 95% prediction interval for the cavity depth of the next hole when the ambient temperature is 1 degree Celsius (this temperature is within the range of those in the original study).
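The slope interval and the prediction interval at 1 degree Celsius need only the quoted estimates, the table summaries, and the `qt()` quantiles. The following is a sketch of the arithmetic under the usual simple-linear-regression formulas; the variable names are illustrative, and every numeric input is copied from the exercise statement.

```r
n     <- 12
x_bar <- 10.92
s_xx  <- (n - 1) * 137.17   # sum of squared deviations of temperature
b0    <- 20.12506           # estimated intercept
b1    <- -0.34504           # estimated slope
se_b1 <- 0.06008            # standard error of the slope
s     <- 2.334              # square root of the MSE

# 98% confidence interval for the slope
ci_slope <- b1 + c(-1, 1) * qt(0.99, df = n - 2) * se_b1

# 95% prediction interval for cavity depth at 1 degree Celsius
x0       <- 1
fit      <- b0 + b1 * x0
se_new   <- s * sqrt(1 + 1 / n + (x0 - x_bar)^2 / s_xx)
pi_depth <- fit + c(-1, 1) * qt(0.975, df = n - 2) * se_new

round(ci_slope, 3)
round(pi_depth, 2)
```

The slope interval lies entirely below zero, so with 98% confidence each additional degree Celsius is associated with a decrease in mean cavity depth. The prediction interval is comparatively wide because it combines the uncertainty in the fitted line with the scatter of an individual cavity around that line.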