Strongly, positively linearly associated.
In this scenario, what are the explanatory and response variables?
Explanatory: Calories
Response: Carbohydrates
Why might we want to fit a regression line to these data?
We might be on a low-carbohydrate diet but only have access to calorie information, so we want to predict the former from the latter.
# y = mx + b
#
# m = R * (Sy/Sx)
# b = mean(y) - m * mean(x)
.67 * (9.41/10.37)
## [1] 0.6079749
171.14 - 0.6079749*107.20
## [1] 105.9651
y = 0.6079749*x + 105.9651
For every 1 cm increase in shoulder girth, we predict an increase of about 0.608 cm in height. A shoulder girth of 0 cm is not possible, so the intercept (105.97 cm) is only a theoretical prediction at x = 0 and has no practical meaning.
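The slope and intercept arithmetic above can be wrapped in a small helper. This is only a sketch: the function name `line_from_summary` and its argument names are made up for illustration, and the inputs are the summary statistics already used above.

```r
# Hypothetical helper: least squares line from summary statistics.
line_from_summary <- function(r, sx, sy, xbar, ybar) {
  slope <- r * (sy / sx)             # slope = R * (Sy / Sx)
  intercept <- ybar - slope * xbar   # the line passes through (xbar, ybar)
  c(slope = slope, intercept = intercept)
}

# Shoulder girth (x) and height (y) summary statistics from above.
fit <- line_from_summary(r = 0.67, sx = 10.37, sy = 9.41,
                         xbar = 107.20, ybar = 171.14)
round(fit, 4)
```

Returning a named vector keeps the slope and intercept together, so later predictions can use `fit[["slope"]]` and `fit[["intercept"]]` instead of retyped decimals.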
# R^2 = correlation^2
0.67^2
## [1] 0.4489
About 44.89% of the variability in height is explained by shoulder girth.
0.6079749*100 + 105.9651
## [1] 166.7626
160 - 166.7626
## [1] -6.7626
The residual is the observed value minus the predicted value. It is negative, so the model overestimates this student's height by about 6.76 cm.
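As a rough sketch of the same calculation, using the usual convention that a residual is observed minus predicted (so a negative value means the line overpredicts). The function names `predict_height` and `height_residual` are hypothetical; the coefficients are the ones computed above.

```r
# Hypothetical helpers built on the fitted line y = 0.6079749*x + 105.9651.
predict_height <- function(girth) 0.6079749 * girth + 105.9651

# Residual = observed height - predicted height.
height_residual <- function(observed, girth) observed - predict_height(girth)

# Student with 100 cm shoulder girth and 160 cm observed height.
height_residual(observed = 160, girth = 100)
```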
0.6079749*56 + 105.9651
## [1] 140.0117
I’m not sure whether 56 cm falls outside the range of the data used to fit the model. If it is within the range, the prediction is reasonable to use. The value above looks somewhat plausible, but the linear relationship between shoulder girth and height likely breaks down for infants.
y = 4.034x - 0.357
At a theoretical body weight of 0 kg, the predicted heart weight would be -0.357 g. This is not a meaningful value, so the intercept has no real-world interpretation.
For every 1 kg increase in body weight, we predict a 4.034 g increase in heart weight.
64.41% of the variability in heart weight is explained by body weight.
sqrt(0.6441)
## [1] 0.8025584
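A minimal sketch of recovering the correlation from R^2, assuming R^2 is entered as a proportion (0.6441 rather than 64.41) and that the sign of the correlation is taken from the slope. The name `cor_from_r2` is made up for illustration.

```r
# Hypothetical helper: correlation from R^2 (as a proportion) and the slope's sign.
cor_from_r2 <- function(r_squared, slope) {
  stopifnot(r_squared >= 0, r_squared <= 1)  # guard against passing a percentage
  sign(slope) * sqrt(r_squared)
}

# Heart weight vs. body weight: R^2 = 64.41%, slope = 4.034.
cor_from_r2(r_squared = 0.6441, slope = 4.034)
```

The range check is there because a correlation must lie between -1 and 1, which a percentage input would violate.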
3.9983 = m*(-0.0883) + 4.010
(3.9983 - 4.010)/-0.0883
## [1] 0.1325028
Statistically significant evidence of a weak positive association.
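The slope can be recovered without the correlation coefficient because a least squares line always passes through the point of means. The sketch below assumes the reported intercept is +4.010 and uses a hypothetical helper name, `slope_from_means`.

```r
# Hypothetical helper: solve ybar = intercept + slope * xbar for the slope.
slope_from_means <- function(xbar, ybar, intercept) (ybar - intercept) / xbar

# Mean beauty score, mean evaluation score, and the assumed intercept of 4.010.
slope_from_means(xbar = -0.0883, ybar = 3.9983, intercept = 4.010)
```

Under that assumption the slope comes out positive, at roughly 0.13.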
Linearity: slightly non-linear at extremes
Normality of residuals: yes
Constancy of variation: yes (slight fanning at smaller values of beauty)
Don’t understand 7.23d (non-linear?) and 7.40a (don’t we need correlation coefficient?)