Nutrition at Starbucks, Part I. The scatterplot below shows the relationship between the number of calories and amount of carbohydrates (in grams) Starbucks food menu items contain. Since Starbucks only lists the number of calories on the display items, we are interested in predicting the amount of carbs a menu item has based on its calorie content.
There appears to be a positive, roughly linear relationship between calories and carbs: items with more calories tend to have more carbs.
The explanatory variable is calories (on the x axis) and the dependent variable is carbs (on the y axis).
Fitting a regression line allows us to quantify this relationship and predict the amount of carbs from the number of calories listed.
The residuals appear to be nearly normal. However, the constant variability condition does not appear to hold: the residuals fan out as calories increase.
Exercise 7.15 introduces data on shoulder girth and height of a group of individuals. The mean shoulder girth is 107.20 cm with a standard deviation of 10.37 cm. The mean height is 171.14 cm with a standard deviation of 9.41 cm. The correlation between height and shoulder girth is 0.67.
We start with the following point-slope equation: y - 171.14 = b1(x - 107.20)
Calculating b1 from R and the standard deviations given above, we get b1 = 0.61
B1 <- (9.41/10.37)*(0.67)
After solving for y, we get the following: y = 105.748 + 0.61X
R <- 0.67
x <- 107.20
y <- 171.14
sd.x <- 10.37
sd.y <- 9.41
b1 <- (sd.y/sd.x)*R
**The slope is b1 <- (sd.y/sd.x)*R or 0.61**
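As a quick check, the intercept follows from the point-slope form, b0 = mean(y) - b1 * mean(x). The sketch below reuses the summary statistics from the exercise. The names `x.bar`, `y.bar`, `b0`, and `b0.exact` are introduced here for illustration only, and the slope is rounded to two decimals to match the hand calculation.

```r
# Summary statistics from the exercise
R     <- 0.67     # correlation between height and shoulder girth
x.bar <- 107.20   # mean shoulder girth (cm)
y.bar <- 171.14   # mean height (cm)
sd.x  <- 10.37    # SD of shoulder girth (cm)
sd.y  <- 9.41     # SD of height (cm)

# Slope, rounded to two decimals as in the hand calculation
b1 <- round((sd.y / sd.x) * R, 2)

# Intercept: the line passes through (x.bar, y.bar)
b0 <- y.bar - b1 * x.bar

# Same intercept without rounding the slope first
b0.exact <- y.bar - (sd.y / sd.x) * R * x.bar

c(slope = b1, intercept = b0, intercept.unrounded = b0.exact)
```

With the rounded slope the intercept should agree with the 105.748 used in this write-up. Keeping the slope unrounded gives a slightly different intercept (about 105.97), so the two versions should not be mixed in one calculation.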
R^2 is the square of the correlation, so it comes out to 0.67^2 or 0.4489. In this case, the model accounts for about 45% of the variation in height.
To predict the height for a shoulder girth of 100 cm, we plug X = 100 into y = 105.748 + 0.61X
We get 166.748 cm
y <- 105.748 + 0.61*(100)
y
## [1] 166.748
We use the following formula: \(e_i = y_i - \hat{y}_i\)
ei = 160 - 166.748 = -6.748. In this case, our model overestimated the height, and therefore the residual is negative.
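The residual calculation can be sketched in R as follows. The variable names are chosen here for illustration; the coefficients, the shoulder girth of 100 cm, and the observed height of 160 cm are the values used in this write-up.

```r
# Fitted line from the write-up: height = 105.748 + 0.61 * girth
b0 <- 105.748
b1 <- 0.61

girth           <- 100   # shoulder girth (cm)
observed.height <- 160   # observed height (cm)

# Predicted height, then residual = observed - predicted
predicted.height <- b0 + b1 * girth
residual <- observed.height - predicted.height
residual
```

A negative residual means the observed value lies below the regression line, that is, the model overestimated the height for this individual.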
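This model would not make sense for a small child. A small child's shoulder girth would fall far below the values the model was built on (mean 107.20 cm, standard deviation 10.37 cm), so the prediction would be an extrapolation, and the relationship between height and shoulder girth is likely different for a growing child.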
The following regression output is for predicting the heart weight (in g) of cats from their body weight (in kg). The coe
m<- matrix(c(-0.357, 0.692,-0.515,0.607, 4.034,0.250,16.119,0.000), nrow=2, byrow=TRUE)
colnames(m) <- c("Estimate", "Std. Error", "t value", "Pr(>|t|)")
rownames(m) <- c("(Intercept)", "body wt")
m
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   -0.357      0.692  -0.515    0.607
## body wt        4.034      0.250  16.119    0.000
The linear model is Y=-0.357 + 4.034X or cat.heart.wt = -0.357 + 4.034(cat.body.wt)
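The intercept (-0.357) is the predicted heart weight, in grams, of a cat with a body weight of 0 kg. This value has no practical meaning on its own; it only sets the height of the line.
The slope is 4.034: for each additional kilogram of body weight, we expect the heart weight to be higher by 4.034 g on average.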
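To make the roles of the two coefficients concrete, here is a minimal sketch that wraps the fitted line in a function. The function name `heart.wt.hat` and the example body weights of 2 kg and 3 kg are hypothetical and chosen only for illustration.

```r
# Coefficients copied from the regression output
b0 <- -0.357   # intercept (g)
b1 <- 4.034    # slope (g per kg of body weight)

# Predicted heart weight (g) for a given body weight (kg)
heart.wt.hat <- function(body.wt) b0 + b1 * body.wt

# At a body weight of 0 kg the prediction is just the intercept
at.zero <- heart.wt.hat(0)

# Predictions one kilogram apart differ by exactly the slope
one.kg.difference <- heart.wt.hat(3) - heart.wt.hat(2)

c(at.zero = at.zero, one.kg.difference = one.kg.difference)
```

The first value reproduces the intercept, and the second reproduces the slope regardless of which pair of body weights one kilogram apart is used.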
R2 is given as 64.66%, which means that body weight explains about 65% of the variation in heart weight.
The correlation coefficient is the square root of R2, sqrt(0.6466) = 0.8041144, taken as positive because the slope is positive. This means that there is a positive correlation between the body weight of a cat and the weight of a cat’s heart.
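The correlation can be recovered from R2 and the sign of the slope. This is a small sketch using the values quoted in this write-up; the variable names are illustrative.

```r
r.squared <- 0.6466   # R2 as given (64.66%)
slope     <- 4.034    # slope from the regression output

# The correlation has the same sign as the slope
r <- sign(slope) * sqrt(r.squared)
r
```

Taking the sign from the slope matters because the square root alone cannot distinguish a positive correlation from a negative one.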
Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because these measures may reflect the influence of non-teaching related characteristics, such as the physical appearance of the instructor. Researchers at University of Texas, Austin collected data on teaching evaluation score (higher score means better) and standardized beauty score (a score of 0 means average, negative score means below average, and a positive score means above average) for a sample of 463 professors. The scatterplot below shows the relationship between these variables, and also provided is a regression output for predicting teaching evaluation score from beauty score.
Using the following formula \(T = \frac{\text{estimate} - \text{null value}}{SE}\), we can substitute the values for T and SE from the given table and solve for the estimate: \(4.13 = \frac{\beta_1 - 0}{0.0322}\), so \(\beta_1 = 4.13 \times 0.0322 \approx 0.133\)
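The same arithmetic can be sketched in R. The t statistic, standard error, and sample size are the values quoted in this exercise; the two-sided p-value is an added check that assumes n - 2 = 461 degrees of freedom, as in a simple linear regression, and is not taken from the original output.

```r
t.stat <- 4.13     # t value for the beauty slope
se     <- 0.0322   # standard error of the slope
n      <- 463      # number of professors in the sample

# Rearranging T = (estimate - 0) / SE gives the slope estimate
b1 <- t.stat * se

# Two-sided p-value for H0: slope = 0 (assumes df = n - 2)
p.value <- 2 * pt(-abs(t.stat), df = n - 2)

c(slope = b1, p.value = p.value)
```

Under these assumptions the p-value is far below 0.05, which is what a t statistic above 4 with several hundred degrees of freedom would suggest.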
Based on the scatterplot, there appears to be a mostly positive relationship, with several points that pull the regression line up. However, many points are not scattered closely around a straight line, so it is relatively difficult to tell from the plot alone.
One of the conditions for linear regression is that the residuals follow a nearly normal pattern and, based on the graph provided, it seems like this condition is satisfied. However, one of the other conditions is that there should not be any curvature in the points. Towards the end of this graph, it appears that the points curve away from a linear model.