1. Generally, as the calories increase, so do the carbs.

  2. The explanatory variable is calories. The response is carbs.

  3. To predict the carbs in a Starbucks menu item, given its calorie count.

  4. Yes. The first graph shows a linear relationship, and the variability around the line appears to be constant. The histogram shows that the residuals are nearly normally distributed.

  1. The book defines the regression line as: \[ height = \beta_{0} + \beta_{1} \times shoulder girth \]

  2. The slope is: \[ b_{1} = \frac{s_{y}}{s_{x}}R \] where \(s_{y}\) and \(s_{x}\) are the given standard deviations for height and shoulder girth. So all told it will be: \[ b_{1} = \frac{9.41}{10.37} \times 0.67 \]

b1 <- (9.41/10.37) * 0.67
b1
## [1] 0.6079749

For the y-intercept: \[ y - y_{0} = slope \times (x - x_{0}) \]

We will use the means as our \((x_{0}, y_{0})\) and set \(x = 0\). Since we have the slope, we can solve for y: \[ y - 171.14 = 0.608 \times (0 - 107.20) \]

y <- (b1 * -107.20) + 171.14
y
## [1] 105.9651

This means that if shoulder girth is 0 cm, the model predicts a height of 105.965 cm. Also, we would expect an additional 0.608 cm in height for each additional 1 cm in shoulder girth.
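The slope and intercept steps above can be wrapped into a small helper, as a minimal sketch. The function name `line_from_summary` and its argument names are hypothetical, chosen here for illustration; the numbers are the summary statistics used above.

```r
# Hypothetical helper: least-squares line from summary statistics.
# slope = (s_y / s_x) * R, and the line passes through (x_mean, y_mean).
line_from_summary <- function(x_mean, y_mean, s_x, s_y, r) {
  slope <- (s_y / s_x) * r
  intercept <- y_mean - slope * x_mean
  list(slope = slope, intercept = intercept)
}

fit <- line_from_summary(x_mean = 107.20, y_mean = 171.14,
                         s_x = 10.37, s_y = 9.41, r = 0.67)
fit$slope      # about 0.608
fit$intercept  # about 105.965

# Sanity check: plugging in the mean shoulder girth returns the mean height.
fit$intercept + fit$slope * 107.20
```

The sanity check at the end relies on the fact that a least-squares line always passes through the point of means.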

  1. \(R^2\) is the correlation squared, so in this case it’s \(0.67^2\), or 0.4489. That is, about 45% of the variation in height is explained by using shoulder girth to predict it.

  2. We use the regression line found earlier, for a student with a shoulder girth of 100 cm.

\[ height = 105.965 + 0.608 \times 100 \]

Or, we would expect the student to have a height of 166.765 cm.

  1. The residual is actual - expected, so the residual is 160 - 166.765, or -6.765, meaning we overestimated the height by about 6.8 cm.

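  2. Judging by the chart, the shoulder girths in the data for 7.15 start at around the 80 cm mark. While it’s possible to plug a much smaller value into the linear model, that value lies so far outside the observed range that doing so would be extrapolation, and the prediction would not be reliable.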
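The numbers in the parts above can be rechecked in R. This is a sketch; the variable names are chosen here for illustration, and the unrounded slope and intercept are used, so the last digits can differ slightly from hand calculations with rounded values.

```r
b1 <- (9.41 / 10.37) * 0.67      # slope
b0 <- 171.14 - b1 * 107.20       # intercept

r_squared <- 0.67^2              # 0.4489
predicted <- b0 + b1 * 100       # predicted height at 100 cm shoulder girth
residual  <- 160 - predicted     # observed minus predicted

r_squared
predicted
residual
```

A negative residual means the model’s prediction is above the observed height.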

  1. The linear model would be \(heart weight = -0.357 + 4.034 \times body weight\)

  2. If a cat has a body weight of 0 kg, the model predicts a heart weight of -0.357 grams. That is not realistic, but a body weight of 0 is far outside the observed data, so the intercept has no practical meaning here.

  3. For each additional kilogram of body weight, we expect the cat’s heart weight to be another 4.034 grams.

  4. The model’s \(R^2\) of 64.66% means that body weight explains 64.66% of the variation in heart weight.

  5. The correlation coefficient is R, so we take the square root of \(R^2\), or \(\sqrt{0.6466}\) in this case (the positive root, since the slope is positive), which is:

sqrt(0.6466)
## [1] 0.8041144
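
As a small sketch, the fitted line for the cats can be written as an R function to check the interpretations above. The function name `predict_heart_weight` and the example body weights of 2 kg and 3 kg are assumptions made for illustration, not values from the exercise.

```r
# Fitted line: heart weight (g) = -0.357 + 4.034 * body weight (kg)
predict_heart_weight <- function(body_weight_kg) {
  -0.357 + 4.034 * body_weight_kg
}

predict_heart_weight(0)                             # the intercept, -0.357
predict_heart_weight(3) - predict_heart_weight(2)   # the slope, 4.034

# R takes the sign of the slope, so here it is the positive root of R^2.
slope <- 4.034
r <- sign(slope) * sqrt(0.6466)
r
```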

  1. We have the y-intercept as 4.010. Let’s try to do what we did in 7.25, but this time solve for the slope, using \(y - y_{0} = slope \times (x - x_{0})\): \[ 4.010 - 3.9983 = slope \times (0 - (-0.0883)) \]

\[ 0.0117 = slope \times 0.0883 \]

\[ \frac{0.0117}{0.0883} = slope \]

Slope is 0.1325028
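
The same algebra can be checked in R, in the style of the earlier exercises. The variable names below are chosen for illustration.

```r
y_intercept <- 4.010
x_mean <- -0.0883    # x_0 in the equation above
y_mean <- 3.9983     # y_0 in the equation above

# Point-slope form solved for the slope, with x = 0 at the intercept.
slope <- (y_intercept - y_mean) / (0 - x_mean)
slope
```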

  1. So, there is a slight upward trend in the scatterplot, but the slope itself shows how minimal that upward trend is. If we were to use this for a linear model, each one-point increase in a teacher’s attractiveness score would correspond to an increase of only about 0.13 in their evaluation score. That’s not a large effect.

  2. The data is mostly linear. The residual plot doesn’t show a curve, and the points are randomly scattered. The histogram shows the residuals are nearly normal. The QQ plot does, however, have multiple extreme points curving away from the reference line. But the variability around the line does seem mostly constant. Overall, I think the conditions are met.