DATA606_Home_Work

Graded: 7.24, 7.26, 7.30, 7.40

7.24 Nutrition at Starbucks, Part I

The scatterplot below shows the relationship between the number of calories and amount of carbohydrates (in grams) Starbucks food menu items contain. Since Starbucks only lists the number of calories on the display items, we are interested in predicting the amount of carbs a menu item has based on its calorie content.

Figure 7.24

Describe the relationship between number of calories and amount of carbohydrates (in grams) that Starbucks food menu items contain.

There is a positive, roughly linear relationship between calories and carbohydrates: items with more calories tend to contain more carbohydrates.

In this scenario, what are the explanatory and response variables?

The explanatory variable is the number of calories (x-axis).

The response variable is the amount of carbohydrates in grams (y-axis).

Why might we want to fit a regression line to these data?

We would like to predict the amount of carbs based on calorie count.

Do these data meet the conditions required for fitting a least squares line?

The trend is roughly linear and the residuals appear nearly normal. However, the spread of the residuals is not constant across calorie values, so the constant variability condition is not met.

7.26 Body measurements, Part III.

Exercise 7.15 introduces data on shoulder girth and height of a group of individuals. The mean shoulder girth is 107.20 cm with a standard deviation of 10.37 cm. The mean height is 171.14 cm with a standard deviation of 9.41 cm. The correlation between height and shoulder girth is 0.67.

Figure 7.26
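The slope and intercept computed next follow from the summary statistics via the usual least squares identities. In the notation assumed here (not used in the original), \(x\) is shoulder girth, \(y\) is height, \(s_x\) and \(s_y\) are their standard deviations, and \(r\) is the correlation:

\[ b_1 = r \, \frac{s_y}{s_x}, \qquad b_0 = \bar{y} - b_1 \bar{x} \]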

B1=round(0.67 * (9.41/10.37),4)
B0=round(B1 * -107.20 + 171.14 , 4)

Write the equation of the regression line for predicting height.

\[ \widehat{\text{height}} = 105.9624 + 0.6080 \times \text{shoulder girth} \]

Interpret the slope and the intercept in this context.

For each additional cm of shoulder girth, the model predicts an additional 0.6080 cm of height. If a person's shoulder girth were zero, we would expect a height of 105.9624 cm. This intercept has no practical meaning, since a shoulder girth of zero is not possible.

Calculate R\(^2\) of the regression line for predicting height from shoulder girth, and interpret it in the context of the application.

R=0.67
R2=R*R

Value of R\(^2\) = 0.4489. About 44.89% of the variability in height is explained by shoulder girth.

A randomly selected student from your class has a shoulder girth of 100 cm. Predict the height of this student using the model.

R100=B0 + B1 * 100
R100

## [1] 166.7624

The student from part (d) is 160 cm tall. Calculate the residual, and explain what this residual means.

# Residual = Observed - Predicted
Res=160 - R100
Res

## [1] -6.7624

The residual is negative, so the model overestimated this student's height by about 6.76 cm.

A one year old has a shoulder girth of 56 cm. Would it be appropriate to use this linear model to predict the height of this child?

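No. A shoulder girth of 56 cm lies far below the mean shoulder girth of 107.20 cm in these data, so predicting this child's height would require extrapolation. We would be assuming that the linear relationship continues to hold in a region where we have no observations.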
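To put the distance in perspective, a minimal sketch using only the summary statistics quoted in the exercise expresses 56 cm in standard deviations from the mean shoulder girth. The variable names are illustrative.

```r
# Summary statistics for shoulder girth, as given in the exercise
mean_girth <- 107.20  # cm
sd_girth   <- 10.37   # cm

# Standardized distance of the child's shoulder girth from the sample mean
child_girth <- 56
z_child <- (child_girth - mean_girth) / sd_girth
round(z_child, 2)
```

The result is roughly five standard deviations below the mean, well outside where the model was fit.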

7.30 Cats, Part I.

The following regression output is for predicting the heart weight (in g) of cats from their body weight (in kg). The coefficients are estimated using a dataset of 144 domestic cats.

Figure 7.30_1

Figure 7.30_2

Write out the linear model.

B0=-0.357
B1=4.034

\[ \widehat{\text{heart weight}} = -0.357 + 4.034 \times \text{body weight} \]

Interpret the intercept.

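If a cat's body weight were zero, the model would predict a heart weight of -0.357 grams. This value has no practical meaning, since a cat cannot weigh zero kg and a heart cannot have negative weight; the intercept only sets the height of the line.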

Interpret the slope.

For each additional kg of body weight, we expect a cat's heart to weigh an additional 4.034 grams, on average.
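The slope interpretation can be checked numerically. The sketch below uses the coefficients from the regression output; the helper `predict_heart_weight` and the body weights of 2 kg and 3 kg are hypothetical choices for illustration, not values from the dataset.

```r
# Coefficients from the regression output in the exercise
B0 <- -0.357  # intercept (g)
B1 <- 4.034   # slope (g per kg)

# Hypothetical helper: predicted heart weight (g) for a body weight (kg)
predict_heart_weight <- function(body_weight_kg) {
  B0 + B1 * body_weight_kg
}

# Illustrative body weights, not taken from the dataset
pred_2kg <- predict_heart_weight(2)
pred_3kg <- predict_heart_weight(3)

# Predictions one kg apart differ by exactly the slope
pred_3kg - pred_2kg
```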

Interpret R\(^2\).

R\(^2\) = 64.66%, which means that 64.66% of the variability in heart weight is explained by body weight using the linear model in (a).

Calculate the correlation coefficient.

R2=.6466
corcof = sqrt(R2)
corcof

## [1] 0.8041144
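The square root alone only gives the magnitude of the correlation; its sign matches the sign of the slope. A small sketch that makes this explicit, reusing the values from the regression output:

```r
# Values from the regression output in the exercise
R2 <- 0.6466
B1 <- 4.034

# sqrt() is never negative, so take the sign of the correlation from the slope
corcof <- sign(B1) * sqrt(R2)
corcof
```

Because the slope is positive, the correlation is positive and the result is unchanged here.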

7.40 Rate my professor.

Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because these measures may reflect the influence of non-teaching related characteristics, such as the physical appearance of the instructor. Researchers at University of Texas, Austin collected data on teaching evaluation score (higher score means better) and standardized beauty score (a score of 0 means average, negative score means below average, and a positive score means above average) for a sample of 463 professors. The scatterplot below shows the relationship between these variables, and also provided is a regression output for predicting teaching evaluation score from beauty score.

Figure 7.40_1

Given that the average standardized beauty score is -0.0883 and average teaching evaluation score is 3.9983, calculate the slope. Alternatively, the slope may be computed using just the information provided in the model summary table.

B0=4.010
B1=4.13 * 0.0322

The slope is 0.132986, computed as the t value times the standard error from the summary table.
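The slope can also be recovered from the two averages given in the question, since a least squares line passes through the point of means. A minimal sketch, assuming the intercept of 4.010 used above; the variable names are illustrative.

```r
# Averages given in the exercise
x_bar <- -0.0883  # mean standardized beauty score
y_bar <- 3.9983   # mean teaching evaluation score

# Intercept as used in the calculation above
B0 <- 4.010

# The least squares line passes through (x_bar, y_bar), so
# y_bar = B0 + B1 * x_bar, which rearranges to:
B1_from_means <- (y_bar - B0) / x_bar
B1_from_means
```

This gives a slope of about 0.1325, which agrees with the t value times standard error calculation up to rounding in the reported figures.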

Do these data provide convincing evidence that the slope of the relationship between teaching evaluation and beauty is positive? Explain your reasoning.

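Yes. The scatterplot shows a great deal of scatter and only a weak upward trend, but the p-value for the slope in the summary table is reported as approximately zero. A p-value this small means we reject the null hypothesis that the slope is zero. The data therefore provide convincing evidence that the slope of the relationship between teaching evaluation and beauty is positive.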
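As a check on the p-value reported in the summary table, it can be recomputed from the t statistic. This sketch assumes that the 4.13 used in the slope calculation is the t value for the slope, and uses the sample size of 463 from the exercise.

```r
# Assumed t statistic for the slope, and the sample size from the exercise
t_stat <- 4.13
n <- 463

# Simple linear regression uses n - 2 degrees of freedom
df <- n - 2

# One-sided p-value for the alternative that the slope is positive
p_one_sided <- pt(t_stat, df = df, lower.tail = FALSE)

# Two-sided p-value, the form regression summary tables usually report
p_two_sided <- 2 * p_one_sided
c(one_sided = p_one_sided, two_sided = p_two_sided)
```

Both values are far below 0.05, which is why the table rounds the p-value to zero.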

List the conditions required for linear regression and check if each one is satisfied for this model based on the following diagnostic plots.

Figure 7.40_2

Linearity: the residuals are scattered randomly around the horizontal axis with no obvious pattern, so the linearity condition is satisfied.

Nearly normal residuals: the histogram of residuals is somewhat left skewed, which may indicate a few outliers, but the residuals are nearly normal.

Constant variability: in the scatterplot, the points show roughly constant variance.

Independent observations: the observations appear to be independent, and the data show a weak linear trend.
