Graded: 7.24, 7.26, 7.30, 7.40

7.24 Nutrition at Starbucks pt. 1

1. There is a positive linear relationship between calories and carbs: the more calories a food item contains, the more carbs it tends to contain.
2. The explanatory variable is calories; the response variable is carbs.
3. We would fit a regression line to these data to find out whether there is a relationship between the two variables, calories and carbs.
4. Conditions for the least squares line:
  - Linearity: The data show a linear trend.
  - Nearly normal residuals: The distribution of the residuals appears to be nearly normal.
  - Constant variability: The variability of points around the least squares line does not remain roughly constant.
  - Independent observations: This condition concerns whether the food items are independent of one another, not whether carbs depend on calories. We have no indication that the items are dependent.
Conclusion: Because the constant variability condition is not met, the data do not meet the conditions required for fitting a least squares line.

7.26 Exercise 7.15 introduces data on shoulder girth and height of a group of individuals. The mean shoulder girth is 107.20 cm with a standard deviation of 10.37 cm. The mean height is 171.14 cm with a standard deviation of 9.41 cm. The correlation between height and shoulder girth is 0.67.

  1. Write the equation of the regression line for predicting height.
# Slope: b1 = r * (s_y / s_x)
b1 <- 0.67 * (9.41 / 10.37)
b1
## [1] 0.6079749
# Intercept: b0 = mean height - b1 * mean shoulder girth
b0 <- 171.14 - b1 * 107.2
b0
## [1] 105.9651

The equation for the regression line is: predicted height = 105.97 + 0.61 * shoulder girth, with both variables in cm.
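The same calculation can be wrapped in a small function so it is reusable for other summary statistics. This is only a sketch: `ls_line` and its argument names are hypothetical and do not come from the textbook or any package.

```r
# Hypothetical helper: least squares line from summary statistics.
# Assumes x is the explanatory variable and y is the response.
ls_line <- function(mean_x, sd_x, mean_y, sd_y, r) {
  slope <- r * sd_y / sd_x               # slope = r * (s_y / s_x)
  intercept <- mean_y - slope * mean_x   # the line passes through (mean_x, mean_y)
  c(intercept = intercept, slope = slope)
}

# Shoulder girth (x) and height (y) values from the exercise
coefs <- ls_line(mean_x = 107.20, sd_x = 10.37,
                 mean_y = 171.14, sd_y = 9.41, r = 0.67)
round(coefs, 2)
```

Rounded to two decimals, the result should agree with the intercept and slope computed above.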

  2. Interpret the slope and the intercept in this context.

For each additional cm in shoulder girth, we expect an additional 0.61 cm in height on average. The intercept, 105.97 cm, is the predicted height for a shoulder girth of 0 cm. A shoulder girth of zero does not make sense in this context, so the intercept only serves to adjust the height of the line.

  3. Calculate R2 of the regression line for predicting height from shoulder girth, and interpret it in the context of the application.
R2 <- 0.67^2
R2
## [1] 0.4489

About 45% of the variation in height is explained by shoulder girth through the least squares line.

  4. A randomly selected student from your class has a shoulder girth of 100 cm. Predict the height of this student using the model.
# Predicted height of the student
sthght <- b0 + b1 * 100
sthght
## [1] 166.7626

The predicted height is 166.76 cm.

  5. The student from part (d) is 160 cm tall. Calculate the residual, and explain what this residual means.
# Residual = observed height - predicted height
res <- 160 - sthght
res
## [1] -6.762581

The residual is -6.76 cm. A negative residual means that the model overestimates the observed height.

  6. A one year old has a shoulder girth of 56 cm. Would it be appropriate to use this linear model to predict the height of this child?

No. The smallest shoulder girth in the data set is approximately 85 cm, so predicting at 56 cm would be extrapolation, and it would not be appropriate to use this linear model to predict the height of this child.

7.30 The following regression output is for predicting the heart weight (in g) of cats from their body weight (in kg). The coefficients are estimated using a dataset of 144 domestic cats.

Picture

  1. Write out the linear model.

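predicted heart weight = -0.357 + 4.034 * body weight

  2. Interpret the intercept.

The intercept tells us that for a body weight of 0 kg, the predicted heart weight is -0.357 g. A body weight of zero and a negative heart weight are both impossible, so the intercept has no meaningful interpretation in this context.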

  3. Interpret the slope.

For each additional kg in body weight, we expect heart weight to be 4.034 g higher on average.

  4. Interpret R2.

R2 is 64.66%, which means that approximately 65% of the variability in heart weight is explained by body weight in this model.

  5. Calculate the correlation coefficient.
r <- sqrt(64.66/100)

The correlation coefficient is approximately 0.804. It is positive because the slope of the regression line is positive.
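Taking the square root of R2 gives only the magnitude of the correlation; the sign has to come from the slope. A minimal sketch of that step, assuming the R2 of 64.66% and the slope of 4.034 from the regression output (the variable names are illustrative):

```r
# Values taken from the regression output for the cats data
r_squared <- 64.66 / 100
slope <- 4.034

# The correlation has the same sign as the slope
r_signed <- sign(slope) * sqrt(r_squared)
round(r_signed, 3)
```

With a negative slope, the same code would return a negative correlation of the same magnitude.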

7.40 Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because these measures may reflect the influence of non-teaching related characteristics, such as the physical appearance of the instructor. Researchers at University of Texas, Austin collected data on teaching evaluation score (higher score means better) and standardized beauty score (a score of 0 means average, negative score means below average, and a positive score means above average) for a sample of 463 professors. The scatterplot below shows the relationship between these variables, and also provided is a regression output for predicting teaching evaluation score from beauty score.

Picture

Picture

  1. Given that the average standardized beauty score is -0.0883 and average teaching evaluation score is 3.9983, calculate the slope. Alternatively, the slope may be computed using just the information provided in the model summary table.
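# Slope from two points on the line: (mean beauty, mean evaluation)
# and (0, intercept), where 4.010 is the intercept from the regression output
b1 <- (4.010 - 3.9983) / (0 - (-0.0883))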
b1
## [1] 0.1325028
  2. Do these data provide convincing evidence that the slope of the relationship between teaching evaluation and beauty is positive? Explain your reasoning.

Yes. The t-value for the slope is 4.13 and the corresponding p-value is approximately zero, so we reject the null hypothesis that the slope is zero and conclude that the data provide convincing evidence of a positive relationship between teaching evaluation and beauty.
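As a rough check on the p-value, the one-sided tail probability can be computed from the reported t-value. This sketch assumes the usual n - 2 degrees of freedom for simple linear regression with the 463 professors in the sample, and a one-sided alternative that the slope is greater than zero; the variable names are illustrative.

```r
# t-value for the slope and sample size, as given in the exercise
t_value <- 4.13
n <- 463
deg_free <- n - 2   # assumed: simple linear regression uses n - 2

# One-sided p-value for H0: slope = 0 versus HA: slope > 0
p_one_sided <- pt(t_value, df = deg_free, lower.tail = FALSE)
p_one_sided < 0.05
```

The comparison on the last line returns TRUE, which is consistent with rejecting the null hypothesis at the 5% level.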

  3. List the conditions required for linear regression and check if each one is satisfied for this model based on the following diagnostic plots.

  - Linearity: The data appear to be linear.
  - Nearly normal residuals: The residuals appear to be nearly normal (slightly skewed to the left).
  - Constant variability: The variability of points around the least squares line remains roughly constant; the points appear evenly spread out in the plot.
  - Independent observations: We do not know whether the observations are independent in this case, so we assume that they are and use a least squares regression line.
Picture

Picture