Problem 7.24: Nutrition at Starbucks, Part I

  1. Based on the scatterplot, there is a positive linear relationship, but it is not very strong. The points cluster tightly at the lower end and become more scattered at the upper end.
  2. The amount of carbohydrates (in grams) is the response variable, and the number of calories is the explanatory variable.
  3. To predict the amount of carbohydrates (in grams) from the number of calories.
  4. The relationship is linear but weak. The histogram shows that the residuals are slightly skewed rather than symmetric. In the residual plot, most points are near 0, but some lie noticeably farther away. We have to assume that the cases are independent of each other. In summary, we cannot say with certainty that the conditions for a least squares line are met, because the data show some problems. Statisticians may reasonably disagree on this.
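The spread pattern described above (tight at the low end, wider at the high end) can be checked numerically. The following is only a sketch on simulated data, not the Starbucks measurements: the sample size, coefficients, and noise level are arbitrary assumptions chosen to mimic a fan-shaped scatterplot.

```r
set.seed(1)  # simulated data only; NOT the Starbucks measurements
n <- 200
calories <- runif(n, min = 100, max = 500)
# Noise whose spread grows with calories, mimicking the fan shape described.
carbs <- 10 + 0.1 * calories + rnorm(n, mean = 0, sd = 0.03 * calories)

fit <- lm(carbs ~ calories)
res <- resid(fit)

# Compare residual spread below and above the median of the explanatory variable.
lower  <- calories <= median(calories)
spread <- c(lower = sd(res[lower]), upper = sd(res[!lower]))
spread
```

A clearly larger standard deviation in the upper half than in the lower half points to non-constant variance, which is one of the conditions in question.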

Problem 7.26: Body Measurements, Part III

Sy <- 9.41
Sx <- 10.37
R <- 0.67
y2 <- 171.14
x2 <- 107.2
b1 <- (Sy/Sx)*R
b1
## [1] 0.6079749

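The slope b1 is 0.6079749.

To find b0, use the point-slope form of the line through the point of means (x2, y2):

$y - y_2 = b_1 (x - x_2)$

Setting x = 0 and solving for y gives the intercept, b0 = y2 - b1*x2.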

b0 <- y2 - (b1*x2)
b0
## [1] 105.9651

The equation for the regression is:

height = 105.9651 + 0.6079749*(shoulder girth)
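The slope and intercept steps above can be wrapped in one small helper. This is a sketch: the function name `line_from_summary` and its argument names are hypothetical, and it assumes only the summary statistics already used in this problem.

```r
# Hypothetical helper: least squares line from summary statistics.
# sx, sy: standard deviations of x and y; r: correlation;
# xbar, ybar: sample means of x and y.
line_from_summary <- function(sx, sy, r, xbar, ybar) {
  b1 <- (sy / sx) * r      # slope
  b0 <- ybar - b1 * xbar   # intercept: the line passes through (xbar, ybar)
  c(intercept = b0, slope = b1)
}

fit <- line_from_summary(sx = 10.37, sy = 9.41, r = 0.67,
                         xbar = 107.2, ybar = 171.14)
round(fit, 4)
```

Rounded to four decimals, this should reproduce an intercept of about 105.9651 and a slope of about 0.6080.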

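  1. Slope: For each additional 1 cm in shoulder girth, the model predicts an additional 0.608 cm in height, on average. Intercept: The model predicts a height of 105.97 cm when shoulder girth is 0 cm. A shoulder girth of 0 cm is not physically possible, so the intercept has no practical meaning here and only sets the height of the line.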

r5 <- R^2
r5
## [1] 0.4489

About 44.89% of the variability in height is explained by the model, that is, by shoulder girth.

x <- 100
y <- b0 + (b1*x)
y
## [1] 166.7626

The predicted height for the randomly selected person is 166.7626 cm.

  1. The residual is the observed height minus the predicted height:
residual_height <- 160 - 166.7626
residual_height
## [1] -6.7626

The residual for that student is -6.7626 cm. Because the residual is negative, the model overestimates this student's height.
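The prediction and residual steps can be combined as below. This is a sketch: `predict_height` is a hypothetical helper name, and the coefficients are the rounded values printed earlier in this problem.

```r
# Fitted line from this problem (coefficients rounded as printed above).
b0 <- 105.9651
b1 <- 0.6079749

# Hypothetical helper: predicted height (cm) for a given shoulder girth (cm).
predict_height <- function(girth) b0 + b1 * girth

observed  <- 160                    # observed height in cm
predicted <- predict_height(100)    # shoulder girth of 100 cm
residual  <- observed - predicted   # negative => the model overestimates

round(c(predicted = predicted, residual = residual), 4)
```

The sign of `residual` carries the interpretation: a negative value means the observed height falls below the line.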

  1. No. The scatterplot from problem 7.15 shows shoulder girths starting at about 85 cm, so a one-year-old's shoulder girth of 56 cm is far below the range of the sample. Using the model for this child would be extrapolation, which is not appropriate.

Problem 7.30: Cats, Part I

  1. heartweight = -0.357 + 4.034(body wt)
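  2. Intercept: A cat with a body weight of 0 kg is expected to have a heart weight of -0.357 grams, which is not possible. The intercept has no meaningful interpretation here.
  3. Slope: For each additional 1 kg in body weight, the model predicts an additional 4.034 grams of heart weight, on average.
  4. Body weight explains 64.66% of the variability in heart weight.
  5. The correlation coefficient is the square root of R^2. It is positive because the slope is positive.
correlation_coefficient <- sqrt(0.6466)
correlation_coefficient
## [1] 0.8041144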
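Taking a square root gives only the magnitude of the correlation, so the direction has to come from the slope. A minimal sketch, using the R^2 and slope values from this problem (the variable names are illustrative):

```r
r_squared <- 0.6466   # R^2 used above
slope     <- 4.034    # estimated slope for body weight

# sqrt() only gives the magnitude; sign() of the slope supplies the direction.
correlation <- sign(slope) * sqrt(r_squared)
round(correlation, 4)
```

With a negative slope, the same line would return a negative correlation instead.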

Problem 7.40: Rate my professor

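The regression line y = Bo + B1*x passes through the point of averages, so substituting the mean of x (x1 below) and the mean of y (y2 below) and solving for B1 gives the slope: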

bo <- 4.010
x1 <- -0.0883
y2 <- 3.9983
b1 <- (y2 - bo)/x1
b1
## [1] 0.1325028

Slope is 0.1325028.
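As a sanity check on the slope, the fitted line should return the mean of y when the mean of x is plugged back in. This sketch assumes that -0.0883 and 3.9983 are the sample means of the explanatory and response variables, as the calculation above implies; the variable names are illustrative.

```r
b0     <- 4.010     # intercept used above
x_mean <- -0.0883   # assumed: mean of the explanatory variable
y_mean <- 3.9983    # assumed: mean of the response variable

# Solve y_mean = b0 + b1 * x_mean for the slope.
b1 <- (y_mean - b0) / x_mean

# Check: plugging the mean of x back in should return the mean of y.
check <- b0 + b1 * x_mean
all.equal(check, y_mean)
```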

  1. The slope estimate (0.1325) is greater than 0, and its test statistic is more than 4, which is far from 0. This gives good grounds to reject the null hypothesis that the slope is 0.
  2. There is no pattern in the residuals, so linearity seems reasonable. We assume that the cases are independent. Furthermore, the spread of the residuals is about the same at the lower and upper ends, which satisfies the constant variance condition. The residuals also appear approximately normal, so overall the conditions for linear regression are met.