7.24

7.24 Nutrition at Starbucks, Part I. The scatterplot below shows the relationship between the number of calories and amount of carbohydrates (in grams) Starbucks food menu items contain. Since Starbucks only lists the number of calories on the display items, we are interested in predicting the amount of carbs a menu item has based on its calorie content.

(a) Describe the relationship between number of calories and amount of carbohydrates (in grams) that Starbucks food menu items contain.
(b) In this scenario, what are the explanatory and response variables?
(c) Why might we want to fit a regression line to these data?
(d) Do these data meet the conditions required for fitting a least squares line?

  (a) The relationship between number of calories and amount of carbohydrates is:
  • linear: there is no strong curvature
  • positive: carbohydrates tend to increase with calories
  • weak: the points are widely spread around the trend
  (b) The variables are:
  • explanatory variable: calories
  • response variable: carbohydrates (in grams)
  (c) We might want to fit a regression line to these data so that we can predict the amount of carbohydrates in a menu item from its calorie count, which is the only value listed on the display.

  (d) Conditions for the least squares line:

  • linearity: yes, the trend is linear
  • nearly normal residuals: yes, the histogram of the residuals looks symmetric and unimodal
  • constant variability: no, in the residual plot the residuals are more spread out on the right than on the left
  • independent observations: yes, assuming the food items are independent of one another

No, these data do not meet the conditions required for fitting a least squares line, because the constant variability condition fails.
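
The constant variability check in part (d) can be made concrete with a small sketch. The Starbucks data are not reproduced here, so the example simulates hypothetical calorie and carbohydrate values whose scatter grows with calories. The seed, sample size, and coefficients are all assumptions chosen for illustration. It then compares the residual spread in the lower and upper halves of the calorie range.

```r
# Hypothetical data, NOT the Starbucks menu: the noise grows with calories,
# which mimics a fan-shaped residual plot.
set.seed(1)
calories <- seq(100, 500, length.out = 80)
carbs <- 10 + 0.1 * calories + rnorm(80, sd = 0.03 * calories)

# Fit the least squares line and extract the residuals.
fit <- lm(carbs ~ calories)
res <- resid(fit)

# Compare the residual spread for low-calorie vs. high-calorie items.
low <- calories <= median(calories)
sd_low <- sd(res[low])
sd_high <- sd(res[!low])
spread_ratio <- sd_high / sd_low

# A ratio well above 1 suggests the constant variability condition fails.
print(c(sd_low = sd_low, sd_high = sd_high, ratio = spread_ratio))

# Visual checks (uncomment to draw the plots):
# plot(calories, res); abline(h = 0, lty = 2)
# hist(res)
```

With real data, the same comparison would be run on the residuals of the fitted model, alongside the residual plot and the histogram.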

7.26

7.26 Body measurements, Part III. Exercise 7.15 introduces data on shoulder girth and height of a group of individuals. The mean shoulder girth is 107.20 cm with a standard deviation of 10.37 cm. The mean height is 171.14 cm with a standard deviation of 9.41 cm. The correlation between height and shoulder girth is 0.67.
(a) Write the equation of the regression line for predicting height.
(b) Interpret the slope and the intercept in this context.
(c) Calculate \(R^2\) of the regression line for predicting height from shoulder girth, and interpret it in the context of the application.
(d) A randomly selected student from your class has a shoulder girth of 100 cm. Predict the height of this student using the model.
(e) The student from part (d) is 160 cm tall. Calculate the residual, and explain what this residual means.
(f) A one year old has a shoulder girth of 56 cm. Would it be appropriate to use this linear model to predict the height of this child?

Given:

\(mean_{shoulder} = 107.2\)
\(sd_{shoulder} = 10.37\)
\(mean_{height} = 171.14\)
\(sd_{height} = 9.41\)
\(R = 0.67\)

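  (a) The regression line for predicting height (y, in cm) from shoulder girth (x, in cm) is: \(y = {\beta}_0 + {\beta}_1 * x\), with \({\beta}_1 = R * sd_{height} / sd_{shoulder}\) and \({\beta}_0 = mean_{height} - {\beta}_1 * mean_{shoulder}\)
# summary statistics from the exercise
mean_shoulder <- 107.2
sd_shoulder <- 10.37
mean_height <- 171.14
sd_height <- 9.41
R <- 0.67

# find b0 (intercept) and b1 (slope)
b1 <- R * (sd_height / sd_shoulder)
b0 <- mean_height - (b1 * mean_shoulder)

y = 105.9650878 + 0.6079749*x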

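  (b) Slope: \({\beta}_1\) = 0.6079749. For each additional centimeter of shoulder girth, the model predicts an additional 0.608 cm of height, on average. Intercept: \({\beta}_0\) = 105.9650878. At x = 0, the model predicts a height of about 105.97 cm. Since a shoulder girth of 0 cm is not possible, the intercept has no practical meaning here and only sets the height of the line.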

  (c) \(R^2\) = \(0.67^2\) = 0.4489 = 44.89%. About 44.89% of the variability in height is explained by the linear model on shoulder girth.

  (d) For x = 100: predicted_height = 105.9650878 + 0.6079749*100 = 166.7625805 cm.

  (e) For y = 160, the residual is: \(e_i\) = observed - predicted = y - predicted_height = 160 - 166.7625805 = -6.7625805. The negative residual means that the model overestimated the height of this student by about 6.76 cm.

  (f) The one year old has a shoulder girth of 56 cm. Based on the scatterplot, the observed shoulder girths range from about 80 cm to 140 cm, so 56 cm is well outside this range. Using the model here would be extrapolation, so it is not appropriate to use this linear model to predict the height of this child.
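
The arithmetic in parts (c) to (f) can be reproduced with a short sketch that continues from the summary statistics above. The helper names `predict_height` and `is_extrapolation` are hypothetical and introduced only for illustration. The 80 to 140 cm bounds are the range read from the scatterplot in part (f), not values computed from the data.

```r
# Summary statistics quoted in the exercise.
mean_shoulder <- 107.20
sd_shoulder <- 10.37
mean_height <- 171.14
sd_height <- 9.41
R <- 0.67

# Least squares slope and intercept from the summary statistics.
b1 <- R * sd_height / sd_shoulder
b0 <- mean_height - b1 * mean_shoulder

# Hypothetical helper: predicted height (cm) for a shoulder girth (cm).
predict_height <- function(shoulder_girth) b0 + b1 * shoulder_girth

r_squared <- R^2                   # part (c)
predicted <- predict_height(100)   # part (d)
residual <- 160 - predicted        # part (e): observed - predicted

# Part (f): flag shoulder girths outside the range seen in the scatterplot.
# The default bounds are an assumption read off the plot.
is_extrapolation <- function(shoulder_girth, lower = 80, upper = 140) {
  shoulder_girth < lower | shoulder_girth > upper
}

print(round(c(b0 = b0, b1 = b1, r_squared = r_squared,
              predicted = predicted, residual = residual), 4))
print(is_extrapolation(56))
```

Keeping the range check separate from the prediction makes it explicit that the model will return a number for any input, including inputs where the prediction should not be trusted.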

7.30

7.30 Cats, Part I. The following regression output is for predicting the heart weight (in g) of cats from their body weight (in kg). The coefficients are estimated using a dataset of 144 domestic cats.

(a) Write out the linear model.
(b) Interpret the intercept.
(c) Interpret the slope.
(d) Interpret \(R^2\).
(e) Calculate the correlation coefficient.

  (a) Formula: \(y = {\beta}_0 + {\beta}_1 * x\), where x is the body weight (in kg) and y is the heart weight (in g).

From the regression output, we can conclude that \({\beta}_0 = (estimate, intercept) = -0.357\) and \({\beta}_1 = (estimate, body wt) = 4.034\)

y = -0.357 + 4.034 * x

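  (b) The intercept is \({\beta}_0\) = -0.357. This means that a cat with a body weight of 0 kg is expected to have a heart weight of -0.357 grams. This value is not meaningful on its own, since neither a 0 kg cat nor a negative heart weight is possible. The intercept only sets the height of the line.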

  (c) The slope is \({\beta}_1\) = 4.034. This means that for each additional kilogram of body weight, the heart weight is expected to increase by 4.034 grams, on average.

  (d) Given \(R^2\) = 64.66% = 0.6466. This means that body weight explains about 64.66% of the variability in heart weight.

  (e) r = correlation coefficient = \(\sqrt{R^2}\) = \(\sqrt{0.6466}\) = 0.8041144. The positive root is taken because the slope is positive.
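
The model in part (a) and the correlation in part (e) can be checked with a minimal sketch. The coefficients and \(R^2\) are the values read from the regression output. The helper `predict_heart_wt` and the 3 kg example cat are hypothetical, added only to show how the fitted line is used.

```r
# Coefficients and R^2 as read from the regression output in the exercise.
b0 <- -0.357         # intercept (g)
b1 <- 4.034          # slope (g per kg of body weight)
r_squared <- 0.6466

# Hypothetical helper: predicted heart weight (g) for a body weight (kg).
predict_heart_wt <- function(body_wt) b0 + b1 * body_wt

# The correlation has the same sign as the slope, so take the signed root.
r <- sign(b1) * sqrt(r_squared)

# Example input chosen for illustration: a 3 kg cat.
print(predict_heart_wt(3))
print(r)
```

Using `sign(b1)` rather than a bare square root keeps the sketch correct for a model with a negative slope as well.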

7.40

(a) Given that the average standardized beauty score is -0.0883 and average teaching evaluation score is 3.9983, calculate the slope. Alternatively, the slope may be computed using just the information provided in the model summary table.
(b) Do these data provide convincing evidence that the slope of the relationship between teaching evaluation and beauty is positive? Explain your reasoning.
(c) List the conditions required for linear regression and check if each one is satisfied for this model based on the following diagnostic plots.


Formula: \(y = {\beta}_0 + {\beta}_1 * x\)

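  (a) Given: \(\bar{x}\) = -0.0883 (average standardized beauty score) and \(\bar{y}\) = 3.9983 (average teaching evaluation score). The least squares line passes through the point (\(\bar{x}\), \(\bar{y}\)), so \(\bar{y} = {\beta}_0 + {\beta}_1 * \bar{x}\).

\({\beta}_0 = (estimate, intercept) = 4.010\)

\({\beta}_1\) = (\(\bar{y}\) - \({\beta}_0\)) / \(\bar{x}\) = (3.9983 - 4.010) / (-0.0883) = 0.1325028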

  (b) n = 463. Let’s consider these hypotheses:

\(H_0\): \({\beta}_1\) = 0

\(H_1\): \({\beta}_1\) > 0

The test statistic is t = (\({\beta}_1\) - 0) / std_error_beauty = 4.1149948. With df = n - 2 = 463 - 2 = 461, the p_value is < 0.005, which is below 0.05, so \(H_0\) is rejected.

Since \(H_0\) is rejected in favor of \({\beta}_1\) > 0, these data provide convincing evidence that the slope of the relationship between teaching evaluation and beauty is positive.

  (c) Conditions for the linear regression:
  • linearity: yes, the relationship looks linear, with no strong curvature
  • nearly normal residuals: yes, the histogram of the residuals looks symmetric and unimodal
  • constant variability: satisfied, because the residual plot shows roughly the same spread everywhere
  • independent observations: yes, assuming the professors were randomly selected and are independent
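
The slope in part (a) and the one-sided test in part (b) can be reproduced with the sketch below. The standard error is not quoted in the text above, so `se_b1 = 0.0322` is an assumption backed out from the reported test statistic (0.1325028 / 4.1149948). It should be replaced with the value from the model summary table.

```r
x_bar <- -0.0883   # average standardized beauty score
y_bar <- 3.9983    # average teaching evaluation score
b0 <- 4.010        # intercept from the model summary
n <- 463           # number of professors

# The least squares line passes through (x_bar, y_bar), so
# y_bar = b0 + b1 * x_bar, which can be solved for b1.
b1 <- (y_bar - b0) / x_bar

# ASSUMPTION: standard error backed out from the reported t statistic,
# not read from the original table.
se_b1 <- 0.0322

# One-sided test of H0: b1 = 0 against H1: b1 > 0 with n - 2 degrees of freedom.
t_stat <- (b1 - 0) / se_b1
p_value <- pt(t_stat, df = n - 2, lower.tail = FALSE)

print(c(b1 = b1, t = t_stat, p_value = p_value))
```

Setting `lower.tail = FALSE` gives the upper-tail probability, which matches the one-sided alternative \({\beta}_1\) > 0.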