7.24, 7.26, 7.30, 7.40
#7.24
#a) Calories and Carb have a positive linear relationship. As Calories increases, Carb also increases, on average.
#b) Calories is the explanatory variable, whereas Carb is the response variable.
#c) We fit a regression line to summarize how Carb changes with Calories and to predict Carb from Calories. The difference between each observed value and the fitted line (the residual) shows how well the line describes the relationship between the independent and dependent variables. If the residuals for the observed values are too large, the explanatory variable does not explain the dependent variable effectively, and the model is not a good approximation for predicting the outcome we want.
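#d) Residuals are mostly centered around 0. The size of the slope is not one of the conditions; what matters is linearity, nearly normal residuals, constant variability, and independent observations. Provided the residual plot shows no curvature or fan shape, we can say the data meet the conditions required for fitting a least squares line.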
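# A minimal sketch of how the conditions in (d) could be checked in R. The data
# are simulated here because the exercise's data are not part of these notes,
# and the names calories, carb, and fit are hypothetical.

```r
set.seed(1)
calories <- runif(75, min = 80, max = 500)        # simulated explanatory variable
carb <- 8 + 0.1 * calories + rnorm(75, sd = 10)   # simulated response variable
fit <- lm(carb ~ calories)

# Residuals vs. fitted values: look for curvature or a fan shape
plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal Q-Q plot: points close to the line suggest nearly normal residuals
qqnorm(resid(fit))
qqline(resid(fit))
```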
#7.26
#a)
mean_height = 171.14
mean_girth = 107.2
sd_height = 9.41
sd_girth = 10.37
corr_height_girth = 0.67
beta = (sd_height/sd_girth)*corr_height_girth
beta
## [1] 0.6079749
# Point-slope form: y - mean_height = beta*(girth - mean_girth)
# Rearranged: y = beta*girth - (beta*mean_girth) + mean_height, so the intercept is:
-beta*mean_girth + mean_height
## [1] 105.9651
# The fitted line is y_fit = intercept + (beta*girth), so y_fit = 105.9651 + (0.6079749*girth)
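# The same calculation can be wrapped in a small helper. This is only a sketch:
# line_from_summary and coefs are hypothetical names, not part of the exercise.

```r
# Least-squares line from summary statistics:
#   slope     = r * sd_y / sd_x
#   intercept = mean_y - slope * mean_x
line_from_summary <- function(mean_x, sd_x, mean_y, sd_y, r) {
  slope <- r * sd_y / sd_x
  intercept <- mean_y - slope * mean_x
  c(intercept = intercept, slope = slope)
}

# x is shoulder girth, y is height (values from part a)
coefs <- line_from_summary(mean_x = 107.2, sd_x = 10.37,
                           mean_y = 171.14, sd_y = 9.41, r = 0.67)
coefs
```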
#b)
#The slope is 0.608: each additional 1 cm of shoulder girth is associated with an increase of 0.608 cm in height, on average. The intercept means that a girth of 0 cm corresponds to an expected height of 105.97 cm, which has no practical meaning because a girth of 0 cannot be observed.
#c)
corr_height_girth^2
## [1] 0.4489
#44.89% of variation in height can be explained by girth.
#d)
x = 100
y_fit = 105.9651 + (0.6079749 * x)
y_fit
## [1] 166.7626
#The predicted height of this student is about 166.76 cm.
#e)
y = 160
residual = y - y_fit
residual
## [1] -6.76259
# The residual is about -6.76. A negative residual means the model overestimates this student's height: the observed height is 6.76 cm below the predicted value.
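# Parts (d) and (e) can be combined in two small functions. A sketch only:
# predict_height, residual_for, and res are hypothetical names, and the default
# coefficients are the rounded values from part (a).

```r
# Predicted height (cm) for a given shoulder girth (cm)
predict_height <- function(girth, intercept = 105.9651, slope = 0.6079749) {
  intercept + slope * girth
}

# Residual = observed - predicted; negative means the model overestimates
residual_for <- function(observed, girth) {
  observed - predict_height(girth)
}

res <- residual_for(observed = 160, girth = 100)
res
```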
#f)
x = 56
y_fit = 105.9651 + (0.6079749 * x)
y_fit
## [1] 140.0117
#From the plot in ex 7.15, the minimum shoulder girth in the sample is around 85 cm. Predicting for a girth below 85 cm would be extrapolation, so this linear model should not be used for a shoulder girth of 56 cm.
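# A simple guard against extrapolation could look like the sketch below.
# is_extrapolation is a hypothetical helper; the lower bound of 85 cm is the
# approximate minimum read from the plot, and the upper bound is left open
# because it is not stated in these notes.

```r
# TRUE when x falls outside the observed range of the explanatory variable
is_extrapolation <- function(x, x_min, x_max = Inf) {
  x < x_min | x > x_max
}

is_extrapolation(c(56, 100), x_min = 85)
```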
#7.30
#a) heart_weight_fit = -0.357 + 4.034*body_weight
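#b) If body weight is 0, the model predicts a heart weight of -0.357 on average. A body weight of 0 is not possible, so the intercept only sets the height of the line and has no practical meaning here.
#c) Each 1 unit increase in body weight is associated with an increase of 4.034 in heart weight, on average.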
#d) 64.66% of variation in heart weight can be explained by body weight.
#e)
R_squared = 0.6466
corr_heart_body = sqrt(R_squared)  # take the positive root because the slope (4.034) is positive
corr_heart_body
## [1] 0.8041144
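# sqrt() only gives the magnitude of the correlation; the sign has to come from
# the slope. A minimal sketch, where signed_corr is a hypothetical helper:

```r
# Correlation recovered from R-squared, with the sign taken from the slope
signed_corr <- function(r_squared, slope) {
  sign(slope) * sqrt(r_squared)
}

signed_corr(0.6466, slope = 4.034)    # slope from part (a): positive correlation
signed_corr(0.6466, slope = -4.034)   # hypothetical negative slope: negative correlation
```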
#7.40
#a)
y_sample_mean = 3.9983
x_sample_mean = -0.0883
intercept = 4.010
beta = (y_sample_mean - intercept ) / x_sample_mean
beta
## [1] 0.1325028
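# The slope formula in (a) relies on the least squares line passing through the
# point (x_sample_mean, y_sample_mean). A quick check of that identity, using
# hypothetical short names (x_bar, y_bar, b0, b1) for the same quantities:

```r
y_bar <- 3.9983    # sample mean of the response
x_bar <- -0.0883   # sample mean of the explanatory variable
b0 <- 4.010        # intercept

# Solve y_bar = b0 + b1 * x_bar for the slope
b1 <- (y_bar - b0) / x_bar

# Plugging x_bar back into the line should return y_bar
stopifnot(isTRUE(all.equal(b0 + b1 * x_bar, y_bar)))
b1
```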
#b)
#Because beta is positive, the relationship between teaching evaluation and beauty is positive.
#c)
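#The residuals are centered around 0, and the points on the normal Q-Q plot lie mostly on the theoretical line, so the residuals appear nearly normal. We also know that the observations are independent. Linearity and constant variability should be checked in the residual plot as well before concluding that the conditions for the least squares line are met.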