Ch 7 Regression

Graded: 7.24, 7.26, 7.30, 7.40

7.24 Nutrition at Starbucks pt. 1

There is a linear relationship between calories and carbs.
explanatory: calories, response: carbs
We would want to fit a regression line to this data to predict the number of carbs based on number of calories
Although the relationship is linear and the residuals are nearly normal, the residual plot shows that the variability of the residuals is not constant. Because the constant-variability condition is violated, a simple linear model is not adequate for these data.
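
As a rough illustration of what non-constant residual variability looks like, the sketch below fits a line to simulated data whose noise grows with x. The data, seed, and variable names are all made up for this note; none of it is the Starbucks data.

```r
# Simulated example (hypothetical data): the noise standard deviation grows with x
set.seed(1)                                   # arbitrary seed, for reproducibility
x <- seq(1, 100, length.out = 200)
y <- 2 + 0.5 * x + rnorm(200, sd = 0.1 * x)   # sd of the noise increases with x

fit <- lm(y ~ x)      # ordinary least-squares line
res <- resid(fit)     # residuals: observed minus fitted

# Compare the residual spread in the lower and upper halves of x
sd_low  <- sd(res[x <= 50])
sd_high <- sd(res[x > 50])
c(low = sd_low, high = sd_high)
```

A clearly larger spread in the upper half is the fan shape that `plot(fitted(fit), res)` would show, and it is the kind of pattern that violates the constant-variability condition.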

7.26 Height and shoulder girth

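y0  <- 171.14  # mean height (cm)
sdy <- 9.41    # standard deviation of height (cm)
x0  <- 107.2   # mean shoulder girth (cm)
sdx <- 10.37   # standard deviation of shoulder girth (cm)
r   <- .67     # correlation between shoulder girth and height
(b1 <- (sdy/sdx)*r)  # slope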

## [1] 0.6079749

(b0 <- y0-b1*x0)

## [1] 105.9651
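
Written out, the two computations above are the standard summary-statistic formulas for a least-squares line. The symbols \(\bar{x}, s_x\) (shoulder girth) and \(\bar{y}, s_y\) (height) are notation assumed for this note and correspond to `x0`, `sdx`, `y0`, `sdy` in the code:

\[
b_1 = r\,\frac{s_y}{s_x} = 0.67 \cdot \frac{9.41}{10.37} \approx 0.60797,
\qquad
b_0 = \bar{y} - b_1\,\bar{x} = 171.14 - 0.60797 \cdot 107.2 \approx 105.97
\]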

b1: For each additional 1 cm of shoulder girth, the model predicts height to be about 0.61 cm greater, on average.
b0: At a shoulder girth of 0 cm, the model predicts a height of about 106 cm. Since a shoulder girth of zero is impossible, the intercept has no practical meaning here; it only sets the vertical position of the line.
\(R^2 = 0.45\): about 45% of the variability in height is explained by shoulder girth.

(r2 <- r^2)

## [1] 0.4489

height <- function(x){
    b0 + b1*x
}
height(100)

## [1] 166.7626

(e <- 160 - height(100))

## [1] -6.762581

The residual is negative, so the model overestimates this person's height by about 6.8 cm.

The model should not be used to predict the child's height: it was fit to a sample of adult men and women, so applying it to a child would mean extrapolating beyond the data.

7.30 Heart weight and body weight

\(\widehat{heart\_weight} = -0.357 + 4.034 \cdot body\_weight\)
b0 = -0.357: when body weight is 0 kg, the predicted heart weight is -0.357 g. Since a negative weight is impossible, the intercept has no practical meaning; it only sets the vertical position of the line.
b1 = 4.034: for each additional kg of body weight, predicted heart weight increases by 4.034 g, on average.
\(R^2 = 64.66\%\): 64.66% of the variability in heart weight is explained by body weight.
Calculate the correlation coefficient (positive, because the slope is positive):

sqrt(.6466)

## [1] 0.8041144
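
A small sketch of the same step that makes the sign explicit: in simple linear regression the magnitude of r is the square root of \(R^2\), and its sign matches the sign of the slope. The variable names below are chosen for this note only.

```r
# Recover the correlation from R^2 and the sign of the fitted slope
r_squared <- 0.6466   # R^2 from the regression output
slope     <- 4.034    # fitted slope for body weight

# sqrt() gives the magnitude; sign() carries over the direction of the slope
r_from_r2 <- sign(slope) * sqrt(r_squared)
round(r_from_r2, 4)
```

With a negative slope the same expression would return a negative correlation, which a bare `sqrt()` call cannot do.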

7.40 Teaching evaluations and beauty

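b0 <- 4.010    # intercept from the regression output
x0 <- -.0883   # mean of the explanatory variable (beauty)
y0 <- 3.9983   # mean of the response (teaching evaluation)
(b1 <- (y0-b0)/x0)  # slope: the least-squares line passes through (x0, y0)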

## [1] 0.1325028

Since \(\beta_1\) has a t-value of 4.13 and a p-value of approximately zero, we reject the null hypothesis \(\beta_1 = 0\) and conclude that the data provide convincing evidence of a positive relationship between teaching evaluation and beauty.
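
The test can be sketched numerically as follows. The slope and t-value come from the work above; the degrees of freedom are NOT given in these notes, so `df_assumed` is a placeholder assumption and the resulting p-value is only indicative.

```r
# Slope recovered from the intercept and the point (x0, y0), as above
b1_hat <- (3.9983 - 4.010) / -0.0883
t_stat <- 4.13                    # t-value quoted for the slope

# Since t = b1 / SE, the implied standard error of the slope is:
se_b1 <- b1_hat / t_stat

# ASSUMPTION: degrees of freedom (n - 2) are not stated in these notes;
# 461 is a placeholder value used only to illustrate the calculation.
df_assumed <- 461

# Two-sided p-value for H0: beta_1 = 0
p_value <- 2 * pt(-abs(t_stat), df = df_assumed)
c(slope = b1_hat, se = se_b1, p = p_value)
```

For any reasonably large sample, a t-value of 4.13 gives a p-value far below 0.05, which is why regression output rounds it to zero.
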
The conditions for linear regression are as follows:

Linearity - the relationship appears to be linear
Nearly Normal Residuals - the residuals appear to be nearly normal
Constant Variability - the residuals show a large degree of variability; the condition is met only if that spread is roughly constant across beauty scores
Independent Observations - the observations may not be independent, since the scores students give a professor may be influenced by their peers