1. Describe the null hypotheses to which the p-values given in Table 3.4 correspond. Explain what conclusions you can draw based on these p-values. Your explanation should be phrased in terms of sales, TV, radio, and newspaper, rather than in terms of the coefficients of the linear model.

Null Hypothesis for Intercept (Intercept = 0): * The null hypothesis is that average sales are zero when no money is spent on TV, radio, or newspaper advertising. The p-value for the intercept is less than 0.0001, providing strong evidence to reject this hypothesis: even with no advertising at all, average sales would not be zero.

Null Hypothesis for TV (Coefficient for TV = 0): * The null hypothesis is that TV advertising has no effect on sales once radio and newspaper spending are held fixed. The p-value for TV is less than 0.0001, well below any conventional significance level, so we reject the null hypothesis: increasing the TV advertising budget is associated with an increase in sales.

Null Hypothesis for Radio (Coefficient for Radio = 0): * The null hypothesis is that radio advertising has no effect on sales once TV and newspaper spending are held fixed. The p-value for radio is also less than 0.0001, so we reject the null hypothesis: increasing the radio advertising budget is associated with an increase in sales.

Null Hypothesis for Newspaper (Coefficient for Newspaper = 0): * The null hypothesis is that newspaper advertising has no effect on sales once TV and radio spending are held fixed. The p-value for newspaper is 0.8599, far above 0.05, so we fail to reject the null hypothesis: based on these data, there is no evidence that newspaper advertising is associated with sales after accounting for TV and radio spending.
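
For reference, the regression behind Table 3.4 can be reproduced along these lines. This is a minimal sketch that assumes a local copy of the Advertising data saved as Advertising.csv with columns TV, radio, newspaper, and sales; the file name, path, and column names are assumptions.

```python
# Minimal sketch: reproduce the Table 3.4 regression, assuming a local
# Advertising.csv with columns TV, radio, newspaper, sales (path assumed).
import pandas as pd
import statsmodels.formula.api as smf

ads = pd.read_csv("Advertising.csv")

# Multiple regression of sales on all three media budgets.
model = smf.ols("sales ~ TV + radio + newspaper", data=ads).fit()

# The summary reports, for each predictor, the coefficient estimate, its
# standard error, the t-statistic, and the p-value for H0: coefficient = 0.
print(model.summary())
```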

2. Carefully explain the differences between the KNN classifier and KNN regression methods.

KNN Classifier: The K-Nearest Neighbors (KNN) classifier is a versatile tool employed in classification tasks involving multiple classes. Its primary objective is to predict the class or category to which a new data point belongs from a predefined set of possible classes. For instance, in the context of identifying fruit types such as apples, oranges, or bananas based on specific features, KNN classifies the new data point into one of these discrete categories.

To achieve this classification, KNN relies on a distance metric, often using Euclidean or Manhattan distance measures. The distance between the new data point and its K-nearest neighbors is calculated to identify the most similar instances. The decision rule is established through majority voting among these neighbors. Each class receives a vote, and the class with the highest number of votes is assigned to the new data point.

In a multi-class setting, decision boundaries are integral to understanding the model’s predictions. Decision boundaries separate different regions associated with distinct classes, forming areas where one class is deemed more likely than others. For instance, in a three-class problem, these boundaries delineate regions where the likelihood of belonging to a particular class is higher.

To assess the performance of the KNN classifier, various evaluation metrics are employed, such as accuracy, precision, recall, and F1-score.
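
A minimal sketch of the classifier in code, using scikit-learn on a tiny made-up fruit dataset; all feature values and labels below are invented purely for illustration.

```python
# Minimal KNN classification sketch using scikit-learn; the tiny dataset
# below is invented purely for illustration.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Features (e.g., weight in grams, diameter in cm) and class labels (fruit type).
X = np.array([[150, 7.0], [170, 7.5], [140, 6.8],   # apples
              [120, 6.0], [130, 6.2], [115, 5.9]])  # oranges
y = np.array(["apple", "apple", "apple", "orange", "orange", "orange"])

# K = 3 neighbors, Euclidean distance (the default metric).
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X, y)

# The predicted label is decided by majority vote among the 3 nearest neighbors.
print(clf.predict([[145, 6.9]]))  # -> ['apple']
```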

KNN Regression:

KNN regression is used for regression tasks. Its main aim is to predict a continuous numerical value for a given input. Like the classifier, it uses a distance metric to find the K nearest neighbours, but instead of taking a majority vote, it predicts the mean (or median) of the response values of those neighbours. Typical evaluation metrics are mean squared error (MSE) and R-squared.
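
A matching sketch for KNN regression, again on an invented toy dataset.

```python
# Minimal KNN regression sketch using scikit-learn; data are invented.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# One feature (e.g., house size in square metres) and a continuous response (price).
X = np.array([[50], [60], [80], [100], [120], [150]])
y = np.array([150.0, 180.0, 230.0, 280.0, 330.0, 400.0])

# K = 3 neighbours; the prediction is the average response of those neighbours.
reg = KNeighborsRegressor(n_neighbors=3)
reg.fit(X, y)

# Predict for a 95 m^2 house: the 3 nearest sizes are 100, 80, and 120,
# so the prediction is the mean of 280, 230, and 330, i.e. 280.
print(reg.predict([[95]]))  # -> [280.]
```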

In summary, the primary distinction lies in the nature of the output (class label for classification vs. continuous value for regression) and the corresponding evaluation metrics. The core mechanism of finding the nearest neighbors based on distance measures remains common between the two methods.

3. Suppose we have a data set with five predictors, X1 = GPA, X2 = IQ, X3 = Level (1 for College and 0 for High School), X4 = Interaction between GPA and IQ, and X5 = Interaction between GPA and Level. The response is starting salary after graduation (in thousands of dollars). Suppose we use least squares to fit the model, and get β̂0 = 50, β̂1 = 20, β̂2 = 0.07, β̂3 = 35, β̂4 = 0.01, β̂5 = −10.

(a) Which answer is correct, and why?

Option (i): For a fixed value of IQ and GPA, high school graduates earn more, on average, than college graduates. * Incorrect as a blanket statement: for fixed IQ and GPA, the college-minus-high-school difference in predicted salary is β̂3 + β̂5 × GPA = 35 − 10 × GPA, which is positive whenever GPA < 3.5, so high school graduates do not always earn more.

Option (ii): For a fixed value of IQ and GPA, college graduates earn more, on average, than high school graduates. * Also incorrect as a blanket statement: because of the negative GPA × Level interaction (β̂5 = −10), the college advantage of 35 shrinks as GPA rises and reverses once GPA exceeds 3.5.

Option (iii): For a fixed value of IQ and GPA, high school graduates earn more, on average, than college graduates provided that the GPA is high enough. * Correct: the difference 35 − 10 × GPA becomes negative when GPA > 3.5, so for sufficiently high GPAs high school graduates earn more on average.

Option (iv): For a fixed value of IQ and GPA, college graduates earn more, on average, than high school graduates provided that the GPA is high enough. * Incorrect: it states the opposite of what the fitted model implies; a high GPA favours high school graduates.

Thus option (iii) is the correct answer.
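
As a quick numerical check of this reasoning, the sketch below evaluates the fitted model for a college and a high school graduate at a few GPA values (IQ fixed at 110; the choice of IQ does not affect the difference).

```python
# Quick numerical check of part (a): for fixed IQ and GPA, the college-minus-
# high-school salary difference under the fitted model is b3 + b5 * GPA.
b0, b1, b2, b3, b4, b5 = 50, 20, 0.07, 35, 0.01, -10

def salary(gpa, iq, level):
    """Predicted starting salary (in thousands) from the fitted model."""
    return b0 + b1*gpa + b2*iq + b3*level + b4*gpa*iq + b5*gpa*level

for gpa in (3.0, 3.5, 4.0):
    diff = salary(gpa, 110, 1) - salary(gpa, 110, 0)  # college minus high school
    # Up to floating-point rounding: 3.0 -> +5.0, 3.5 -> 0.0, 4.0 -> -5.0
    print(gpa, diff)
```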

(b) Predict the salary of a college graduate with IQ of 110 and a GPA of 4.0.

Plugging GPA = 4.0, IQ = 110, and Level = 1 (college) into the fitted model:

Salary = 50 + 20(4.0) + 0.07(110) + 35(1) + 0.01(4.0 × 110) − 10(4.0 × 1) = 50 + 80 + 7.7 + 35 + 4.4 − 40

Salary = 137.1 thousand dollars.
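
The arithmetic can be double-checked with a couple of lines of Python:

```python
# Check of part (b): plug GPA = 4.0, IQ = 110, Level = 1 (college) into the fit.
gpa, iq, level = 4.0, 110, 1
salary = 50 + 20*gpa + 0.07*iq + 35*level + 0.01*gpa*iq - 10*gpa*level
print(salary)  # -> 137.1 (up to floating-point rounding)
```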

(c) True or false: Since the coefficient for the GPA/IQ interaction term is very small, there is very little evidence of an interaction effect. Justify your answer.

False.

The size of a coefficient, taken on its own, says nothing about the strength of the evidence for an effect: evidence is judged by the coefficient relative to its standard error (the t-statistic and the associated p-value), which are not given here. A small coefficient can still be highly significant if it is estimated precisely.

Moreover, the scale of the GPA × IQ variable matters. For a typical student (say GPA = 4.0 and IQ = 110) the interaction term equals 440, so its contribution to the predicted salary is 0.01 × 440 = 4.4 thousand dollars, which is not negligible. Without the standard error or p-value for β̂4, the small numerical value of the coefficient by itself provides no basis for concluding that there is little evidence of an interaction effect.
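
One way to see why coefficient size alone is not evidence is to rescale the interaction variable: the coefficient changes by the rescaling factor, but the p-value does not. The sketch below uses simulated data (all numbers are invented) to illustrate this.

```python
# Illustration: rescaling a predictor changes its coefficient by the same
# factor but leaves its p-value unchanged. Data are simulated for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
gpa = rng.uniform(2.0, 4.0, n)
iq = rng.normal(110, 15, n)
interaction = gpa * iq                       # values on the order of hundreds
salary = 50 + 20*gpa + 0.07*iq + 0.01*interaction + rng.normal(0, 5, n)

X1 = sm.add_constant(np.column_stack([gpa, iq, interaction]))
X2 = sm.add_constant(np.column_stack([gpa, iq, interaction / 100]))  # rescaled

fit1 = sm.OLS(salary, X1).fit()
fit2 = sm.OLS(salary, X2).fit()

# The interaction coefficient differs by a factor of 100 between the two fits,
# but the p-value (the evidence for an interaction) is identical.
print(fit1.params[3], fit1.pvalues[3])
print(fit2.params[3], fit2.pvalues[3])
```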