The scatterplot below shows the relationship between the number of calories and amount of carbohydrates (in grams) Starbucks food menu items contain.
Since Starbucks only lists the number of calories on the display items, we are interested in predicting the amount of carbs a menu item has based on its calorie content.
As the Calorie count (Energy) increases so does carb amount, with a lot of scatter to the data. Carbs are one of three sources of energy in food the others being fats and proteins.
Explanatory - Calories Response - Carbs.
In physics, we would do the opposite as energy is the effect (dependent) and carbs are a cause (independent).
Carbs include sugars, in which increased consumption is implicated in metabolic disorders such as obesity and diabetes. Customers who are aware of this will want to know how many Calories on average are coming from carbs.
No. The residuals increase in range as the calorie count increase this means there is not constant variance, also there is a left skew to the data so the residuals are not normally distributed.
Exercise 7.15 introduces data on shoulder girth and height of a group of individuals. The mean shoulder girth is 107.20 cm with a standard deviation of 10.37 cm. The mean height is 171.14 cm with a standard deviation of 9.41 cm. The correlation between height and shoulder girth is 0.67.
\[ \beta_1 = r\frac{s_y}{s_x} \\ \beta_1 = r\frac{s_h}{s_g} \\ \beta_1 = 0.67\frac{9.41}{10.37} \\ \beta_1 = 0.6079749 \\ \beta_0 = \bar{y} - \beta_1 \bar{x} \\ \beta_0 = 171.14 cm- 0.6079749 * 107.20cm \\ \beta_0 = 105.9651 cm \\ y = 105.9651 cm + 0.6079749*x \]
The slope means that for every 1 cm they increase in shoulder girth, their height increases by 0.6079749 cm on average.
The intercept means that if a person were to have 0 cm in shoulder girth they would be 105.9651 cm tall on average. Note that is is not physically possible, so practically it means that people are taller than they are wide at the shoulder.
\[ R = 0.67 \\ R^2 = 0.4489 \]
44.89% of the model is explained by shoulder girth.
\[ y = 105.9651 cm + 0.6079749*x \\ y = 105.9651 cm + 0.6079749*100cm \\ y = 166.7626 cm \]
\[ e = y_i - \hat{y} \\ e = 160 cm - 166.7626 cm \\ e = 6.7626 cm \]
This means that there is a 6.7626cm difference between what our model predicted and what we actually measured.
No, these data were taken on a sample of adults and can only be used to make predictions on an adult population. Children do not belong to the population from which the sample was collected.
The following regression output is for predicting the heart weight (in g) of cats from their body weight (in kg). The coeffcients are estimated using a data set of 144 domestic cats.
Parameter | Estimate | Std. Error | t value | Pr(>|t|) |
---|---|---|---|---|
(Intercept) | -0.357 | 0.692 | -0.515 | 0.607 |
body wt | 4.034 | 0.250 | 16.119 | 0.000 |
s=1.452 R\(^2\)=64.66% R\(^2_{adj}\)=64.41%
Note that the people who made the measurements here made a physics error. Grams and Kilograms measure mass. Weight is the force of gravity applied to an object by Earth, and is measured in Newtons or dynes. Weight increases as mass increases by Newton’s universal law of gravitation, but they are not equivalent.
\(y = -0.357g+4.034*x\)
Where y is heart mass in grams, and x is body mass in Kg.
The intercept here means error in the data has caused an off-set between the body mass and heart mass.
Mass as defined by physics measures an objects inertia and how strongly it interacts with gravity. In both cases, material objects like cats and their hearts must have mass greater than 0. A literal interpretation of the slope would be non-physical.
For every 1 kg a cat increases in mass, their hearts mass increases by 4.034 grams.
64.66% of our model is explained by the cat’s body mass.
\[ R = \sqrt(0.6466) \\ R = 0.8041144 \]
Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because these measures may reflect the influence of non-teaching related characteristics, such as the physical appearance of the instructor. Researchers at University of Texas, Austin collected data on teaching evaluation score (higher score means better) and standardized beauty score (a score of 0 means average, negative score means below average, and a positive score means above average) for a sample of 463 professors.
The scatterplot below shows the relationship between these variables, and also provided is a regression output for predicting teaching evaluation score from beauty score.
Parameter | Estimate | Std. Error | t | value Pr(>|t|) |
---|---|---|---|---|
(Intercept) | 4.010 | 0.0255 | 157.21 | 0.0000 |
beauty | \(\boxed{\space \space }\) | 0.0322 | 4.13 | 0.0000 |
\[ \beta_0 = \bar{y} - \beta_1 \bar{x} \\ \beta_1 \bar{x}= \bar{y}-\beta_0 \\ \beta_1 = \frac{\bar{y}-\beta_0}{\bar{x}} \\ \beta_1 = \frac{3.9983-4.010}{-0.0883} \\ \beta_1 = 0.1325028 \] (b) Do these data provide convincing evidence that the slope of the relationship between teaching evaluation and beauty is positive? Explain your reasoning.
The p-values of the slope and intercept are both <0.05, so the model is statistically significant. The slope seems large enough given the ration of the range of the variables is \((5.0-2.0)/(2.0--1.5)= 0.8571429\). I would say that this means that there is practical significance that appearance has an effect on professor evals.
Constant variance - The residuals look randomly scattered about 0 when plotted in order of collection. Normally Distributed Residuals - Data has a slight left skew but looks fairly Normal in the QQ plot and Histogram. Linearity - The residuals look randomly scattered about 0 when plotted vs beauty.
I would say that the conditions for linear regression are satisfied.