Module 10 Homework

Exercise 7.1

Visualize the residuals. The scatterplots shown below each have a superimposed regression line. If we were to construct a residual plot (residuals versus x) for each, describe what those plots would look like.

  1. The residuals would be close to the x axis.

  2. The residuals would be far from the x axis early on and then get closer to it.
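
As a rough illustration of the two descriptions above, the sketch below computes residuals from a least squares line and compares their spread for small and large x. The data are synthetic and generated only for this example (the seed, slope, and noise levels are assumptions, not values from the exercise's scatterplots), and `residuals` is a hypothetical helper name.

```python
import numpy as np

def residuals(x, y):
    """Return observed minus fitted values for the least squares line of y on x."""
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 fit: [slope, intercept]
    return y - (slope * x + intercept)

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
x = np.linspace(1, 10, 200)      # synthetic x values (assumption)

# Case 1: scatter around the line is the same everywhere.
y_even = 2 * x + rng.normal(0, 1.0, x.size)
# Case 2: scatter is large for small x and shrinks as x grows.
y_shrinking = 2 * x + rng.normal(0, 1.0, x.size) * (11 - x) / 2

for label, y in [("even", y_even), ("shrinking", y_shrinking)]:
    res = residuals(x, y)
    left, right = res[: x.size // 2], res[x.size // 2 :]
    print(label,
          "| spread for small x:", round(float(left.std()), 2),
          "| spread for large x:", round(float(right.std()), 2))
```

In a residual plot, the first case would look like an even band around the horizontal line at zero, while the second would start wide and narrow toward the right.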

Exercise 7.3

Identify relationships, Part I. For each of the six plots, identify the strength of the relationship (e.g. weak, moderate, or strong) in the data and whether fitting a linear model would be reasonable.
a. The relationship is curved, so we would not fit a linear model to it (yet).
b. The strength of the relationship is moderate, and fitting a linear model would be reasonable.
c. The strength of the relationship is weak, but it is still there.
d. The relationship is curved, so we would not fit a linear model to it (yet).
e. The strength of the relationship is strong, and fitting a linear model would be reasonable.
f. The strength of the relationship is strong, and fitting a linear model would be reasonable.

Exercise 7.5

The two scatterplots below show the relationship between final and mid-semester exam grades recorded during several years for a Statistics course at a university.
(a) Based on these graphs, which of the two exams has the strongest correlation with the final exam grade? Explain.
Exam B has the stronger correlation with the final exam grade. Its data points are clustered closer to the line and are arranged in a more linear fashion.
(b) Can you think of a reason why the correlation between the exam you chose in part (a) and the final exam is higher?
Exam B has a stronger correlation because, being the second exam, it is taken when the students have a better hold on the material. Alternatively, the students who were struggling the most may have dropped out before it.

Exercise 7.7 Match the correlation, Part I.

Match the calculated correlations to the corresponding scatterplot.
(a) R = -0.7: (4)
(b) R = 0.45: (3)
(c) R = 0.06: (1)
(d) R = 0.92: (2)

Exercise 7.10 Trees.

The scatterplots below show the relationship between height, diameter, and volume of timber in 31 felled black cherry trees. The diameter of the tree is measured 4.5 feet above the ground.

  1. Describe the relationship between volume and height of these trees.
    The relationship between the volume and height of the trees is weak but still linear.

  2. Describe the relationship between volume and diameter of these trees.
    The relationship between the volume and diameter of the trees is strong.

  3. Suppose you have height and diameter measurements for another black cherry tree. Which of these variables would be preferable to use to predict the volume of timber in this tree using a simple linear regression model? Explain your reasoning.
    It would be preferable to use the diameter to predict the volume of timber in the tree, because diameter has a much stronger relationship with volume than height does.

Exercise 7.15 Correlation, Part I.

What would be the correlation between the ages of husbands and wives if men always married women who were
(a) 3 years younger than themselves?
With x as the husband's age and y as the wife's age, y = x - 3 is a line with a positive slope and a negative y-intercept, and the relationship is perfectly linear.
(b) 2 years older than themselves?
y = x + 2 is a line with a positive slope and a positive y-intercept, and the relationship is perfectly linear.
(c) half as old as themselves?
y = (1/2)x is a line with a shallower, but still positive, slope, and the relationship is perfectly linear.

In all three cases the correlation is exactly 1, because every point falls on a straight line with a positive slope.
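
A quick numerical check of the three cases can be sketched as follows. The husband ages are hypothetical values chosen only for illustration. Any set of ages that are not all equal would give the same result, since a perfectly linear relationship with a positive slope has a correlation of 1.

```python
import numpy as np

# Hypothetical husband ages, made up for this sketch.
husband = np.array([25.0, 30.0, 42.0, 51.0, 67.0])

# Wife's age under each of the three rules from the exercise.
wife_by_rule = {
    "3 years younger": husband - 3,
    "2 years older": husband + 2,
    "half as old": husband / 2,
}

correlations = {}
for rule, wife in wife_by_rule.items():
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is R.
    correlations[rule] = float(np.corrcoef(husband, wife)[0, 1])
    print(rule, "-> R =", round(correlations[rule], 6))
```

Shifting or rescaling the ages changes the intercept and slope of the line, but not the correlation.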

Exercise 7.17

Tourism spending. The Association of Turkish Travel Agencies reports the number of foreign tourists visiting Turkey and tourist spending by year. The scatterplot below shows the relationship between these two variables along with the least squares fit.
(a) Describe the relationship between number of tourists and spending.
There is a positive relationship: spending increases as the number of tourists increases.
(b) What are the explanatory and response variables?
Explanatory = number of tourists, response = spending.
(c) Why might we want to fit a regression line to these data?
To predict the amount of spending when the number of tourists is known.
(d) Do the data meet the conditions required for fitting a least squares line? In addition to the scatterplot, use the residual plot and histogram to answer this question.
At first glance the scatterplot looks linear, as if the data would meet the conditions. The residual plot and the histogram, however, show that the relationship is not linear but curved. So no, the data do not meet the conditions, even though the fit looked good initially.
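
The sketch below illustrates why a residual plot can reveal curvature that a scatterplot hides. It uses a synthetic, gently curved data set (an assumption made for this example, not the tourism data from the exercise) and fits a least squares line to it.

```python
import numpy as np

# Synthetic, gently curved data (assumed for illustration only).
x = np.linspace(0, 10, 50)
y = 0.5 * x**2 + 3 * x

# Least squares line and its residuals.
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

# The correlation is high, so the scatterplot looks roughly linear.
r = float(np.corrcoef(x, y)[0, 1])
print("R =", round(r, 3))

# The residuals are not random: positive at both ends, negative in the middle.
print("first residual: ", round(float(resid[0]), 2))
print("middle residual:", round(float(resid[x.size // 2]), 2))
print("last residual:  ", round(float(resid[-1]), 2))
```

A systematic sign pattern like this in the residuals signals that the linearity condition fails, even when the correlation is high.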