Statistical Learning Exercise

Q1: Describe the null hypotheses to which the p-values given in Table 3.4 correspond. Explain what conclusions you can draw based on these p-values. Your explanation should be phrased in terms of sales, TV, radio, and newspaper, rather than in terms of the coefficients of the linear model.

Each p-value in Table 3.4 tests whether one predictor’s coefficient is zero in the multiple linear regression of sales on the TV, radio, and newspaper advertising budgets.

Null Hypothesis (H₀): With the other two advertising budgets held fixed, spending on this medium has no association with sales (coefficient = 0).
Alternative Hypothesis (H₁): With the other two advertising budgets held fixed, spending on this medium is associated with sales (coefficient ≠ 0).

Interpretation of p-values:

TV (p < 0.0001): Strong evidence that TV spending is associated with sales, given the radio and newspaper budgets.
Radio (p < 0.0001): Strong evidence that radio spending is associated with sales, given the TV and newspaper budgets.
Newspaper (p = 0.8599): No evidence that newspaper spending is associated with sales once the TV and radio budgets are accounted for.

Interpretation:

Higher TV and radio advertising budgets are each associated with higher sales when the other media are held fixed.
Newspaper advertising shows no significant association with sales once TV and radio are in the model, so it may not be a cost-effective strategy.
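The reading of the p-values above can be illustrated with a small simulation. This is only a sketch: the data below are simulated, not the real Advertising data, and the coefficients, noise level, and budget ranges are assumptions chosen so that sales depends on TV and radio but not on newspaper.

```r
# Simulated stand-in for the Advertising data (NOT the real data set).
# Assumption: sales depends on TV and radio, but not on newspaper.
set.seed(42)
n <- 200
TV <- runif(n, 0, 300)
radio <- runif(n, 0, 50)
newspaper <- runif(n, 0, 100)
sales <- 3 + 0.045 * TV + 0.19 * radio + rnorm(n, sd = 1.7)

# Multiple linear regression of sales on all three budgets
fit <- lm(sales ~ TV + radio + newspaper)

# Each row tests H0: that coefficient is zero, given the other two media
coef_table <- summary(fit)$coefficients
round(coef_table, 4)
```

The last column of the coefficient table, `Pr(>|t|)`, holds one p-value per predictor, each corresponding to its own null hypothesis.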

Q2: Carefully explain the differences between the KNN classifier and KNN regression methods.

KNN Classifier:

Used for categorical (qualitative) responses.
Assigns the class chosen by majority vote among the K nearest neighbors of the query point.
Decision boundaries are generally non-linear and become more flexible as K decreases.

KNN Regression:

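Used for continuous (quantitative) responses.
Predicts a value by averaging the responses of the K nearest neighbors of the query point.
Can be more sensitive to outlying response values, because one extreme neighbor shifts the average, whereas in classification it changes at most one vote.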

Key Differences:

Feature	KNN Classification	KNN Regression
Response Type	Categorical	Continuous
Decision Rule	Majority Voting	Averaging
Output	Class Label	Numerical Value
Loss Function	Classification Error	Mean Squared Error
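The two decision rules in the table can be sketched in a few lines of base R. The helper `knn_predict` and the toy data are hypothetical, written only for illustration with a single predictor; it is not a function from any package.

```r
# Hypothetical helper: KNN prediction for one query point x0 (1-D predictor).
knn_predict <- function(x_train, y_train, x0, k,
                        task = c("classification", "regression")) {
  task <- match.arg(task)
  # Indices of the k training points closest to x0
  nearest <- order(abs(x_train - x0))[seq_len(k)]
  neighbours <- y_train[nearest]
  if (task == "classification") {
    # Majority vote; ties go to the first class in table() order
    votes <- table(neighbours)
    names(votes)[which.max(votes)]
  } else {
    # Average of the neighbours' numeric responses
    mean(neighbours)
  }
}

# Toy data (made up for illustration)
x_train <- c(1, 2, 3, 6, 7, 8)
y_class <- c("A", "A", "B", "B", "B", "B")
y_num <- c(1.0, 1.5, 2.0, 5.0, 5.5, 6.0)

pred_class <- knn_predict(x_train, y_class, x0 = 2.4, k = 3, task = "classification")
pred_num <- knn_predict(x_train, y_num, x0 = 2.4, k = 3, task = "regression")
```

With these toy values the three nearest neighbours of 2.4 are the points at 2, 3, and 1, so the classifier returns "A" (two votes to one) and the regression returns the mean 1.5. The neighbour search is identical; only the final aggregation step differs.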

Q3: Suppose we have a data set with five predictors, X1 = GPA, X2 = IQ, X3 = Level (1 for College and 0 for High School), X4 = Interaction between GPA and IQ, and X5 = Interaction between GPA and Level. The response is starting salary after graduation (in thousands of dollars). Suppose we use least squares to fit the model, and get \(\hat{\beta}_0 = 50\), \(\hat{\beta}_1 = 20\), \(\hat{\beta}_2 = 0.07\), \(\hat{\beta}_3 = 35\), \(\hat{\beta}_4 = 0.01\), \(\hat{\beta}_5 = -10\).

(a) Which answer is correct, and why?

Given the model: \[ \hat{Y} = 50 + 20X_1 + 0.07X_2 + 35X_3 + 0.01X_1X_2 - 10X_1X_3 \]

We compare salaries for college and high school graduates while keeping IQ and GPA fixed.

For high school graduates (X₃ = 0): \[ \hat{Y}_{HS} = 50 + 20X_1 + 0.07X_2 + 0 + 0.01X_1X_2 - 10(0) \]
For college graduates (X₃ = 1): \[ \hat{Y}_{College} = 50 + 20X_1 + 0.07X_2 + 35 + 0.01X_1X_2 - 10X_1 \]
Difference: \[ \hat{Y}_{College} - \hat{Y}_{HS} = 35 - 10X_1 \]
If GPA is above 3.5, \(35 - 10X_1\) is negative, so high school graduates earn more on average.
If GPA is below 3.5, the difference is positive, so college graduates earn more on average.

Option (iii) is correct: for fixed values of IQ and GPA, high school graduates earn more on average than college graduates, provided the GPA is high enough (above 3.5).
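As a quick numerical check of the difference derived above, the fitted model can be evaluated for both groups over a few GPA values. The function name `predict_salary` and the choice of IQ = 110 are assumptions made for illustration; the gap does not depend on IQ.

```r
# Fitted model from the question, salary in thousands of dollars
predict_salary <- function(gpa, iq, level) {
  50 + 20 * gpa + 0.07 * iq + 35 * level + 0.01 * gpa * iq - 10 * gpa * level
}

gpa <- c(2.0, 3.0, 3.5, 4.0)

# College minus high school at the same GPA and IQ; equals 35 - 10 * gpa
gap <- predict_salary(gpa, iq = 110, level = 1) -
  predict_salary(gpa, iq = 110, level = 0)

data.frame(gpa = gpa, college_minus_hs = round(gap, 2))
```

The gap is 15, 5, 0, and -5 for these four GPA values, so the sign flips at a GPA of 3.5.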

(b) Predict the salary of a college graduate with IQ of 110 and a GPA of 4.0.

GPA <- 4.0
IQ <- 110
Level <- 1 # College Graduate
predicted_salary <- 50 + 20*GPA + 0.07*IQ + 35*Level + 0.01*GPA*IQ - 10*GPA*Level
predicted_salary

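## [1] 137.1

Because salary is measured in thousands of dollars, the predicted starting salary is about $137,100.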

(c) True or false: Since the coefficient for the GPA/IQ interaction term is very small, there is very little evidence of an interaction effect. Justify your answer.

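False. The size of a coefficient depends on the units of its predictor, so a small value does not by itself indicate a weak effect. Here the product GPA × IQ takes values in the hundreds, so a coefficient of 0.01 can still shift the predicted salary by several thousand dollars. Evidence for an interaction comes from a hypothesis test, that is, from the coefficient's t-statistic (estimate divided by standard error) and its p-value.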
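The point that coefficient size and statistical evidence are separate can be illustrated by rescaling a predictor. The sketch below uses simulated data (the sample size, noise level, and true coefficients are assumptions, not values from the exercise): measuring IQ in units of 100 makes the interaction coefficient 100 times larger while leaving its t-statistic unchanged.

```r
# Simulated data for illustration only
set.seed(1)
n <- 200
gpa <- runif(n, 2, 4)
iq <- rnorm(n, mean = 100, sd = 15)
salary <- 50 + 20 * gpa + 0.07 * iq + 0.01 * gpa * iq + rnorm(n, sd = 5)

# Same model, with IQ in original units and in units of 100
fit_raw <- lm(salary ~ gpa * iq)
iq_scaled <- iq / 100
fit_scaled <- lm(salary ~ gpa * iq_scaled)

# Interaction rows: Estimate, Std. Error, t value, Pr(>|t|)
row_raw <- summary(fit_raw)$coefficients["gpa:iq", ]
row_scaled <- summary(fit_scaled)$coefficients["gpa:iq_scaled", ]

rbind(raw = row_raw, scaled = row_scaled)
```

The estimate and standard error change with the units, but their ratio, and therefore the p-value, does not.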

Q5: Consider the fitted values that result from performing linear regression without an intercept. In this setting, the ith fitted value takes the form as shown in the question. What is \(a_{i'}\)?

Given: \[ \hat{y}_i = x_i \hat{\beta} \] where \[ \hat{\beta} = \frac{\sum_{i'=1}^{n} x_{i'} y_{i'}}{\sum_{k=1}^{n} x_k^2} \]

Substitute \(\hat{\beta}\) into the fitted value: \[ \hat{y}_i = x_i \cdot \frac{\sum_{i'=1}^{n} x_{i'} y_{i'}}{\sum_{k=1}^{n} x_k^2} \] Since \(x_i\) and the denominator do not depend on \(i'\), they can be moved inside the sum: \[ \hat{y}_i = \sum_{i'=1}^{n} a_{i'} y_{i'} \] where: \[ a_{i'} = \frac{x_i x_{i'}}{\sum_{k=1}^{n} x_k^2} \]

Interpretation:

The fitted values are linear combinations of the response values.
The weights \(a_{i'}\) depend only on the predictor values: each is the product \(x_i x_{i'}\) divided by the sum of the squared predictor values.
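The weight formula can be checked numerically against a no-intercept fit from `lm`. The data below are made up for illustration; the matrix `A` holds the weight \(x_i x_{i'} / \sum_k x_k^2\) in row i, column i'.

```r
# Toy data (made up for illustration)
x <- c(1, 2, 3, 4)
y <- c(2.1, 3.9, 6.2, 7.8)

# A[i, i'] = x_i * x_i' / sum(x^2)
A <- outer(x, x) / sum(x^2)

# Fitted values as linear combinations of the responses
fitted_by_weights <- as.vector(A %*% y)

# Fitted values from least squares without an intercept
fitted_by_lm <- unname(fitted(lm(y ~ 0 + x)))

all.equal(fitted_by_weights, fitted_by_lm)
```

Both routes should give the same fitted values, which confirms that each fitted value is a weighted sum of all the responses.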

Statistical Learning Exercise - Chapter 3

Saransh Gupta

02/20/2025
