The p-values in Table 3.4 correspond to hypothesis tests of whether the TV, radio, and newspaper advertising budgets each have a significant association with sales, holding the other two budgets fixed. For each predictor, the null hypothesis is that its coefficient is zero.
The tests reject the null hypothesis for TV and radio but not for newspaper, so businesses should prioritize TV and radio advertisements to enhance sales. Newspaper ads may not be an efficient way to increase revenue, and resources should be allocated accordingly.
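As a rough illustration of how such p-values are read off a fitted model, the sketch below fits a multiple regression in R. The data are simulated, not the Advertising data behind Table 3.4; the variable names and the coefficients used to generate `sales` are assumptions made only for this example.

```r
# Simulated budgets (hypothetical data, not the Advertising data set)
set.seed(1)
n <- 200
tv <- runif(n, 0, 300)
radio <- runif(n, 0, 50)
newspaper <- runif(n, 0, 100)

# Assumption for illustration: sales depend on tv and radio only
sales <- 3 + 0.05 * tv + 0.2 * radio + rnorm(n, sd = 1.5)

# Multiple regression of sales on all three budgets
fit <- lm(sales ~ tv + radio + newspaper)

# Each row's p-value tests H0: that coefficient is zero, given the other predictors
coefs <- summary(fit)$coefficients
p_values <- coefs[, "Pr(>|t|)"]
print(round(p_values, 4))
```

A small p-value for a predictor is evidence against its coefficient being zero; a large one means the data are consistent with no association once the other predictors are in the model.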
The K-Nearest Neighbors (KNN) method is a non-parametric machine learning algorithm used for both classification and regression tasks. However, its application differs based on whether the target variable is categorical (classification) or continuous (regression).
Classification example: predicting whether an email is spam (Yes/No) based on word frequency.
Regression example: predicting house prices based on size, number of rooms, and location.
Feature | KNN Classification | KNN Regression |
---|---|---|
Target Variable | Categorical (Classes) | Continuous (Numerical) |
Prediction Output | Majority class of k neighbors | Average of k neighbors’ values |
Example | Predicting if a customer will buy a product (Yes/No) | Predicting the price of a product |
Decision Rule | Voting among k nearest neighbors | Mean or weighted mean of k nearest values |
Another classification example: classifying flowers as Setosa, Versicolor, or Virginica based on petal and sepal measurements.
KNN is a versatile algorithm that can be used for both classification and regression tasks. The key difference lies in how predictions are made: majority voting for classification and numerical averaging for regression.
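The two decision rules can be sketched in a few lines of base R. This is a minimal illustration, not a library implementation: `knn_predict` is a hypothetical helper, Euclidean distance is an assumed choice, and the toy data are made up.

```r
# Hypothetical helper: KNN prediction for one query point (Euclidean distance)
knn_predict <- function(train_x, train_y, query, k = 3) {
  # Distance from the query to every training row
  dists <- sqrt(rowSums(sweep(train_x, 2, query)^2))
  # Responses of the k closest training points
  neighbors <- train_y[order(dists)[seq_len(k)]]
  if (is.factor(neighbors) || is.character(neighbors)) {
    # Classification: majority vote (ties go to the first class in table order)
    names(which.max(table(neighbors)))
  } else {
    # Regression: unweighted mean of the neighbors' values
    mean(neighbors)
  }
}

# Toy training data: two well-separated clusters
train_x <- matrix(c(1, 1,
                    1, 2,
                    2, 1,
                    6, 6,
                    7, 6,
                    6, 7), ncol = 2, byrow = TRUE)
labels <- c("no", "no", "no", "yes", "yes", "yes")  # categorical target
prices <- c(10, 12, 11, 50, 55, 52)                 # continuous target

class_pred <- knn_predict(train_x, labels, query = c(6.5, 6.5), k = 3)
reg_pred <- knn_predict(train_x, prices, query = c(6.5, 6.5), k = 3)
print(class_pred)
print(reg_pred)
```

The same neighbor search serves both tasks; only the final aggregation step changes with the type of the target.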
We are given the following fitted regression model for starting salary after graduation (in thousands of dollars):
\[ \hat{Y} = 50 + 20X_1 + 0.07X_2 + 35X_3 + 0.01X_4 - 10X_5 \]
where:
- \(X_1\) = GPA
- \(X_2\) = IQ
- \(X_3\) = Level (1 for College, 0 for High School)
- \(X_4 = X_1 \times X_2\) (GPA × IQ Interaction)
- \(X_5 = X_1 \times X_3\) (GPA × Level Interaction)
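Setting \(X_3 = 0\) (so \(X_5 = 0\)) gives the prediction for high school graduates:
\[ \hat{Y}_{HS} = 50 + 20X_1 + 0.07X_2 + 0.01X_4 \]
Setting \(X_3 = 1\) (so \(X_5 = X_1\)) gives the prediction for college graduates:
\[ \hat{Y}_{College} = 50 + 20X_1 + 0.07X_2 + 35 + 0.01X_4 - 10X_1 \]
\[ = 85 + 10X_1 + 0.07X_2 + 0.01X_4 \]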
For high school graduates to earn more than college graduates:
\[ 50 + 20X_1 + 0.07X_2 + 0.01X_4 > 85 + 10X_1 + 0.07X_2 + 0.01X_4 \]
Cancel common terms:
\[ 50 + 20X_1 > 85 + 10X_1 \]
\[ 10X_1 > 35 \]
\[ X_1 > 3.5 \]
Thus, for a fixed IQ, high school graduates are predicted to earn more than college graduates when GPA is greater than 3.5.
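The crossover can also be checked numerically. The sketch below encodes the fitted model as an R function and uses `uniroot` to find where the two predictions are equal; the function names and the choice of IQ = 110 are assumptions for illustration (the IQ terms cancel, so any fixed IQ gives the same answer).

```r
# Fitted model from the text; salary in thousands of dollars
salary_hat <- function(gpa, iq, level) {
  50 + 20 * gpa + 0.07 * iq + 35 * level + 0.01 * gpa * iq - 10 * gpa * level
}

# High school minus college prediction at a fixed IQ (110 is an arbitrary choice)
gap <- function(gpa, iq = 110) {
  salary_hat(gpa, iq, level = 0) - salary_hat(gpa, iq, level = 1)
}

# GPA at which the two predictions are equal
crossover <- uniroot(gap, interval = c(0, 4))$root
print(crossover)

# Sign of the gap on either side of the crossover
print(gap(3.0))  # negative: college graduates predicted to earn more
print(gap(4.0))  # positive: high school graduates predicted to earn more
```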
Given:
- \(X_1 = 4.0\) (GPA)
- \(X_2 = 110\) (IQ)
- \(X_3 = 1\) (College graduate)
- \(X_4 = X_1 \times X_2 = 4.0 \times 110 = 440\)
- \(X_5 = X_1 \times X_3 = 4.0 \times 1 = 4.0\)
\[ \hat{Y} = 50 + (20 \times 4.0) + (0.07 \times 110) + (35 \times 1) + (0.01 \times 440) - (10 \times 4.0) \]
\[ = 50 + 80 + 7.7 + 35 + 4.4 - 40 \]
\[ = 137.1 \]
Predicted Salary: $137,100 (137.1 thousand dollars).
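The same arithmetic can be reproduced in R as a dot product between the coefficient vector and the predictor vector. The vector names below are illustrative labels, not part of the original problem.

```r
# Coefficients of the fitted model, in the order of the equation
coefs <- c(intercept = 50, gpa = 20, iq = 0.07, level = 35,
           gpa_iq = 0.01, gpa_level = -10)

# Predictor values for a college graduate with GPA 4.0 and IQ 110
gpa <- 4.0
iq <- 110
level <- 1
x <- c(1, gpa, iq, level, gpa * iq, gpa * level)

# Predicted starting salary in thousands of dollars
y_hat <- sum(coefs * x)
print(y_hat)  # 137.1
```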
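The coefficient for the GPA × IQ interaction term (\(X_4\)) is 0.01. Its size reflects the units of the predictors: \(X_4\) is a product that takes values in the hundreds (440 in the prediction above), so the term still contributes several thousand dollars to the predicted salary. Evidence for an interaction is judged from the coefficient's standard error and p-value, not from its magnitude.
Thus, the statement is False, as a small coefficient alone does not indicate a weak interaction effect.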
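One way to see why magnitude alone is uninformative: rescaling a predictor changes the interaction coefficient but leaves its t-statistic unchanged. The sketch below shows this on simulated data; the data-generating values and variable names are assumptions for illustration, not estimates from the salary data.

```r
# Simulated data (hypothetical; generated only to illustrate the scaling argument)
set.seed(42)
n <- 300
gpa <- runif(n, 2, 4)
iq <- rnorm(n, mean = 100, sd = 15)
y <- 50 + 20 * gpa + 0.07 * iq + 0.01 * gpa * iq + rnorm(n, sd = 2)

# Same model, with IQ in raw units and in units of 100 points
fit_raw <- lm(y ~ gpa * iq)
iq_scaled <- iq / 100
fit_scaled <- lm(y ~ gpa * iq_scaled)

raw <- summary(fit_raw)$coefficients["gpa:iq", ]
scaled <- summary(fit_scaled)$coefficients["gpa:iq_scaled", ]

# The estimate is 100 times larger after rescaling, but the t value is the same
print(raw[c("Estimate", "t value")])
print(scaled[c("Estimate", "t value")])
```

The strength of evidence for the interaction is carried by the t-statistic and p-value, which do not depend on the units chosen for IQ.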
We are given the fitted values for a linear regression model without an intercept:
\[ \hat{y}_i = x_i \hat{\beta} \]
where:
\[ \hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i'=1}^{n} x_{i'}^2} \]
Substituting the formula for \(\hat{\beta}\) into \(\hat{y}_i\):
\[ \hat{y}_i = x_i \times \frac{\sum_{i'=1}^{n} x_{i'} y_{i'}}{\sum_{i'=1}^{n} x_{i'}^2} \]
Moving \(x_i\) inside the sum, and renaming the denominator's dummy index to \(j\) so that it does not clash with \(i'\):
\[ \hat{y}_i = \sum_{i'=1}^{n} \left( \frac{x_i x_{i'}}{\sum_{j=1}^{n} x_j^2} \right) y_{i'} \]
Comparing with the general form:
\[ \hat{y}_i = \sum_{i'=1}^{n} a_{i'} y_{i'} \]
we identify the weight coefficient:
\[ a_{i'} = \frac{x_i x_{i'}}{\sum_{j=1}^{n} x_j^2} \]
This confirms that the fitted values in linear regression are linear combinations of the response values \(y_{i'}\), weighted by \(a_{i'}\).
Below is an R code snippet that demonstrates how to compute the fitted values in a simple linear regression without an intercept.
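```r
# Sample data
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Least-squares slope for regression through the origin
beta_hat <- sum(x * y) / sum(x^2)

# Fitted values
y_hat <- x * beta_hat

# Weight matrix: entry [i, i'] is x_i * x_i' / sum_j x_j^2
a_i_prime <- outer(x, x, FUN = function(xi, xip) xi * xip / sum(x^2))

cat("Estimated Beta (β̂):", beta_hat, "\n")
cat("Fitted Values (ŷ):", y_hat, "\n")
print("Coefficients a_i':")
print(a_i_prime)

# Check: each fitted value is the weighted sum of the responses
stopifnot(isTRUE(all.equal(as.vector(a_i_prime %*% y), y_hat)))
```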