3. Suppose we have a data set with five predictors, X_{1} = GPA, X_{2} = IQ, X_{3} = Level (1 for College and 0 for High School), X_{4} = Interaction between GPA and IQ, and X_{5} = Interaction between GPA and Level. The response is starting salary after graduation (in thousands of dollars). Suppose we use least squares to fit the model, and get \hat{\beta}_{0} = 50, \hat{\beta}_{1} = 20, \hat{\beta}_{2} = 0.07, \hat{\beta}_{3} = 35, \hat{\beta}_{4} = 0.01, \hat{\beta}_{5} = −10.
(a) Which answer is correct, and why?
i. For a fixed value of IQ and GPA, high school graduates earn more, on average, than college graduates.
ii. For a fixed value of IQ and GPA, college graduates earn more, on average, than high school graduates.
iii. For a fixed value of IQ and GPA, high school graduates earn more, on average, than college graduates provided that the GPA is high enough.
iv. For a fixed value of IQ and GPA, college graduates earn more, on average, than high school graduates provided that the GPA is high enough.
\colorbox{cyan}{Answer:}
The correct answer is iii. For a fixed value of IQ and GPA, high school graduates earn more, on average, than college graduates provided that the GPA is high enough.
Here’s why:
The coefficient for Level (X_{3}) is 35, which means that, ignoring the interaction (ceteris paribus, we would say in economics ;), being a college graduate adds $35,000 to the predicted salary. However, the interaction term GPA:Level (X_{5}) has a coefficient of -10, so the salary advantage of college graduates shrinks by $10,000 for each additional GPA point.
So, if we fix the values of IQ and GPA, the predicted difference between college and high school graduates is 35 - 10 × GPA (in thousands of dollars). This is positive only when the GPA is below 3.5 (since 35/10 = 3.5), so college graduates earn more on average when the GPA is less than 3.5. Once the GPA exceeds 3.5, the negative interaction term outweighs the Level term and high school graduates earn more on average, which is exactly what statement iii says.
Note that this interpretation assumes that the relationship between the predictors and the response is linear and additive, which might not be the case in reality, as the book explains. It’s always a good idea to check the assumptions of your model and consider other potential factors that could be influencing the response.
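To make the break-even GPA concrete, here is a minimal Python sketch (the function name is mine, not from the text) that evaluates the predicted college-minus-high-school salary gap, 35 - 10 × GPA, at a few GPA values:

```python
# Predicted salary gap (college minus high school), in thousands of dollars,
# at a fixed IQ: only the Level and GPA:Level terms differ between the groups.
def college_minus_high_school(gpa):
    return 35 - 10 * gpa

for gpa in [3.0, 3.5, 4.0]:
    print(f"GPA = {gpa}: gap = {college_minus_high_school(gpa):+.1f}")
# GPA = 3.0: gap = +5.0   (college earns more)
# GPA = 3.5: gap = +0.0   (break-even)
# GPA = 4.0: gap = -5.0   (high school earns more)
```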
(b) Predict the salary of a college graduate with IQ of 110 and a GPA of 4.0.
\colorbox{cyan}{Answer:} The predicted salary of a college graduate (Level = 1) with an IQ of 110 and a GPA of 4.0 can be calculated from the fitted linear regression model:
\hat{Y} = \hat{\beta}_{0} + \hat{\beta}_{1}X_{1} + \hat{\beta}_{2}X_{2} + \hat{\beta}_{3}X_{3} + \hat{\beta}_{4}X_{4} + \hat{\beta}_{5}X_{5}
Substituting the given values into the equation:
\hat{Y} = 50 + 20(4.0) + 0.07(110) + 35(1) + 0.01(4.0)(110) - 10(4.0)(1)
\hat{Y} = 50 + 80 + 7.7 + 35 + 4.4 - 40 = 137.1
Therefore, the predicted salary of a college graduate with an IQ of 110 and a GPA of 4.0 is \colorbox{yellow}{\$137,100.}
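The same arithmetic can be checked with a short Python sketch (the variable names below are my own labels, not from the text):

```python
# Fitted coefficients from the exercise (salary measured in thousands of dollars).
b0, b1, b2, b3, b4, b5 = 50, 20, 0.07, 35, 0.01, -10

gpa, iq, level = 4.0, 110, 1  # college graduate with IQ 110 and GPA 4.0

salary = b0 + b1 * gpa + b2 * iq + b3 * level + b4 * gpa * iq + b5 * gpa * level
print(f"{salary:.1f}")  # 137.1, i.e. $137,100
```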
(c) True or false: Since the coefficient for the GPA/IQ interaction term is very small, there is very little evidence of an interaction effect. Justify your answer.
\colorbox{cyan}{Answer:}
The statement is false. The size of the coefficient for the GPA/IQ interaction term (X_{4}) by itself does not determine the strength of the evidence for an interaction effect. The coefficient measures the magnitude of the effect (and its numerical value depends on the scale of the GPA × IQ product), not the strength of the evidence for it.
The evidence for an interaction effect is typically assessed by the p-value associated with the interaction term. If the p-value is small (typically less than 0.05), then we would conclude that there is strong evidence of an interaction effect, regardless of the size of the coefficient.
In other words, a small coefficient means that the interaction effect is small, but it does not mean that the effect is not statistically significant. Conversely, a large coefficient does not necessarily mean that the effect is statistically significant. The significance of the effect is determined by the p-value, not the size of the coefficient.
It’s also important to note that even small effects can be important in certain contexts, especially when the variables have a large range or when the outcome has high stakes. So, the practical significance of an effect should be considered alongside its statistical significance.
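As an illustration only, here is a hedged Python sketch (entirely simulated data, with made-up coefficient and noise values) of how one would assess the evidence for the GPA:IQ interaction in practice, by looking at the interaction term's p-value rather than the raw size of its coefficient:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data loosely matching the exercise's setup; all values here are
# illustrative assumptions, not taken from the text.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "gpa": rng.uniform(1.0, 4.0, n),
    "iq": rng.normal(100, 15, n),
    "level": rng.integers(0, 2, n),
})
df["salary"] = (
    50 + 20 * df["gpa"] + 0.07 * df["iq"] + 35 * df["level"]
    + 0.01 * df["gpa"] * df["iq"] - 10 * df["gpa"] * df["level"]
    + rng.normal(0, 5, n)
)

# The evidence for the GPA:IQ interaction is judged from its p-value,
# not from the raw size of its coefficient (0.01 here).
fit = smf.ols("salary ~ gpa + iq + level + gpa:iq + gpa:level", data=df).fit()
print(fit.params["gpa:iq"], fit.pvalues["gpa:iq"])
```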
6. Using (3.4), argue that in the case of simple linear regression, the least squares line always passes through the point (\bar{x}, \bar{y}).
\colorbox{cyan}{Answer:}
In simple linear regression, we aim to fit a straight line to a set of data points such that it minimizes the sum of the squared differences between the observed and predicted values. The least squares regression line is determined by minimizing the sum of the squared vertical distances between the observed responses (y-values) and the values predicted by the line for corresponding predictor (x) values.
The equation for the least squares regression line is given by:
\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x
where:
- \hat{y} is the predicted response,
- x is the predictor variable,
- \hat{\beta}_0 is the intercept of the regression line,
- \hat{\beta}_1 is the slope of the regression line.
According to the equation (3.4) provided:
\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}
This equation calculates the slope of the least squares regression line \hat{\beta}_1.
Let’s consider the point (\bar{x}, \bar{y}), where \bar{x} is the mean of the predictor variable (x) and \bar{y} is the mean of the response variable (y).
Evaluating the least squares regression line at x = \bar{x}, we get:
\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}
Given that \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} (from equation 3.4), we can substitute this into the equation above:
\hat{y} = (\bar{y} - \hat{\beta}_1 \bar{x}) + \hat{\beta}_1 \bar{x} = \bar{y}
This demonstrates that the predicted response (\hat{y}) at the mean of the predictor variable (\bar{x}) is equal to the mean of the response variable (\bar{y}).
Therefore, the least squares regression line always passes through the point (\bar{x}, \bar{y}).
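As a quick numerical sanity check of this result, here is a minimal Python sketch (arbitrary simulated data, my own variable names) that fits the line using the formulas in (3.4) and evaluates it at \bar{x}:

```python
import numpy as np

# Simulate an arbitrary data set and fit simple linear regression
# using the least squares formulas in (3.4).
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 2 + 3 * x + rng.normal(size=100)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# The fitted line evaluated at x-bar equals y-bar (up to floating point error).
print(np.isclose(b0 + b1 * x.mean(), y.mean()))  # True
```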
7. It is claimed in the text that in the case of simple linear regression of Y onto X, the R^2 statistic (3.17) is equal to the square of the correlation between X and Y (3.18). Prove that this is the case. For simplicity, you may assume that \bar{x} = \bar{y} = 0.
R^2 = \frac{TSS - RSS}{TSS} = 1 - \frac{RSS}{TSS} \tag{3.17}
\text{Cor}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{3.18}
\colorbox{cyan}{Answer:}
In simple linear regression, the R^2 statistic is defined as the proportion of the variance in the response variable Y that is explained by the predictor variable X. The formula for R^2 is given by:
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
where:
- y_i is the observed response value,
- \hat{y}_i is the predicted response value from the regression line,
- \bar{y} is the mean of the response variable Y.
The correlation between X and Y is given by:
\text{Corr}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
where:
- x_i is the observed predictor value,
- \bar{x} is the mean of the predictor variable X.
Given that \bar{x} = \bar{y} = 0, the least squares estimates simplify to \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 0 and \hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}, so that \hat{y}_i = \hat{\beta}_1 x_i. The two formulas become:
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{\beta}_1 x_i)^2}{\sum_{i=1}^{n} y_i^2} \qquad \text{Corr}(X, Y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2} \sqrt{\sum_{i=1}^{n} y_i^2}}
Now, let’s prove that R^2 is equal to the square of the correlation between X and Y. Expanding the residual sum of squares:
\sum_{i=1}^{n} (y_i - \hat{\beta}_1 x_i)^2 = \sum_{i=1}^{n} y_i^2 - 2\hat{\beta}_1 \sum_{i=1}^{n} x_i y_i + \hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} x_i y_i\right)^2}{\sum_{i=1}^{n} x_i^2}
where the last equality uses \hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}. Substituting this into the expression for R^2:
R^2 = 1 - \frac{\sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} x_i y_i\right)^2}{\sum_{i=1}^{n} x_i^2}}{\sum_{i=1}^{n} y_i^2} = \frac{\left(\sum_{i=1}^{n} x_i y_i\right)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2} = \text{Corr}(X, Y)^2
Therefore, in the case of simple linear regression, the R^2 statistic is equal to the square of the correlation between X and Y.
\colorbox{yellow}{\text{Another answer:}}
To prove that in the case of simple linear regression of Y onto X, the R^2 statistic (equation 3.17) is equal to the square of the correlation between X and Y (equation 3.18), let’s start by expressing the formulas for R^2 and the correlation coefficient \rho.
The formula for R^2 (equation 3.17) can equivalently be written as: R^2 = \frac{\text{SSR}}{\text{SST}}, since the least squares fit satisfies \text{SST} = \text{SSR} + \text{RSS}.
where SSR is the regression sum of squares and SST is the total sum of squares. In the case of simple linear regression, SSR can be calculated as: \text{SSR} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2
where \hat{y}_i are the predicted values of Y and \bar{y} is the mean of Y.
Similarly, SST is calculated as: \text{SST} = \sum_{i=1}^{n} (y_i - \bar{y})^2
Now, let’s express the correlation coefficient \rho (equation 3.18): \rho = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
Given that \bar{x} = \bar{y} = 0, this simplifies to: \rho = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2} \sqrt{\sum_{i=1}^{n} y_i^2}}
Now, let’s substitute \bar{x} = \bar{y} = 0 into the expressions for SSR and SST:
\text{SSR} = \sum_{i=1}^{n} \hat{y}_i^2 \qquad \text{SST} = \sum_{i=1}^{n} y_i^2
We can see that when \bar{x} = \bar{y} = 0, SSR is the sum of squared predicted values and SST is the sum of squared observed values.
Now, let’s express R^2 solely in terms of predicted and observed values: R^2 = \frac{\sum_{i=1}^{n} \hat{y}_i^2}{\sum_{i=1}^{n} y_i^2}
Since \hat{y}_i = \hat{\beta}_1 x_i with \hat{\beta}_1 = \frac{\sum_{j=1}^{n} x_j y_j}{\sum_{j=1}^{n} x_j^2} (and \hat{\beta}_0 = 0 because the means are zero), we have \sum_{i=1}^{n} \hat{y}_i^2 = \hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2 = \frac{\left(\sum_{i=1}^{n} x_i y_i\right)^2}{\sum_{i=1}^{n} x_i^2}, and therefore: R^2 = \frac{\left(\sum_{i=1}^{n} x_i y_i\right)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2} = \rho^2
Thus, we have proven that in the case of simple linear regression of Y onto X, the R^2 statistic is equal to the square of the correlation between X and Y.
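As a final sanity check, here is a short Python sketch (simulated, mean-centered data, my own variable names) that verifies numerically that R^2 equals the squared correlation:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = 1.5 * x + rng.normal(size=200)
x, y = x - x.mean(), y - y.mean()   # enforce x-bar = y-bar = 0 as in the exercise

b1 = np.sum(x * y) / np.sum(x ** 2)  # beta_0 = 0 since the data are centered
rss = np.sum((y - b1 * x) ** 2)
tss = np.sum(y ** 2)

r2 = 1 - rss / tss
corr_sq = np.corrcoef(x, y)[0, 1] ** 2
print(np.isclose(r2, corr_sq))  # True
```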