Chapter 2: Exercise 10 — Boston Housing Data
(a) Load the Boston data and describe it
data("Boston")
dim(Boston)
## [1] 506 13
head(Boston)
## crim zn indus chas nox rm age dis rad tax ptratio lstat medv
## 1 0.00632 18 2.31 0 0.538 6.575 65.2 4.0900 1 296 15.3 4.98 24.0
## 2 0.02731 0 7.07 0 0.469 6.421 78.9 4.9671 2 242 17.8 9.14 21.6
## 3 0.02729 0 7.07 0 0.469 7.185 61.1 4.9671 2 242 17.8 4.03 34.7
## 4 0.03237 0 2.18 0 0.458 6.998 45.8 6.0622 3 222 18.7 2.94 33.4
## 5 0.06905 0 2.18 0 0.458 7.147 54.2 6.0622 3 222 18.7 5.33 36.2
## 6 0.02985 0 2.18 0 0.458 6.430 58.7 6.0622 3 222 18.7 5.21 28.7
?Boston
(b) Pairwise scatterplots of the predictors
pairs(Boston, main = "Pairwise Scatterplots of Boston Predictors")

# Findings
# The scatterplots reveal several notable relationships:
# - lstat and medv have a strong negative correlation: tracts with a higher share of lower-status residents tend to have lower median home values.
# - rm and medv show a strong positive relationship: more rooms per dwelling generally means higher home values.
# - nox and dis are negatively correlated, suggesting higher pollution in areas closer to employment centers.
# - rad, tax, and ptratio show clustering, hinting at categorical-like behavior.
# - Some relationships (e.g., crim vs. medv) are nonlinear but still show a general trend.
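# These visual impressions can be checked numerically; a minimal sketch using cor()
# on the pairs called out above (assumes Boston is already loaded):
cor(Boston$lstat, Boston$medv)  # expected to be strongly negative
cor(Boston$rm, Boston$medv)     # expected to be strongly positive
cor(Boston$nox, Boston$dis)     # expected to be strongly negative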
(c) Correlations with per capita crime rate (crim)
cor(Boston$crim, Boston[-which(names(Boston) == "crim")])
## zn indus chas nox rm age dis
## [1,] -0.2004692 0.4065834 -0.05589158 0.4209717 -0.2192467 0.3527343 -0.3796701
## rad tax ptratio lstat medv
## [1,] 0.6255051 0.5827643 0.2899456 0.4556215 -0.3883046
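# To rank the predictors by association with crim, one option is to sort the
# absolute correlations; a small sketch:
crim_cors <- cor(Boston$crim, Boston[-which(names(Boston) == "crim")])
sort(abs(crim_cors[1, ]), decreasing = TRUE)  # rad and tax are the largest above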
(d) Outliers and predictor ranges
summary(Boston$crim)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00632 0.08204 0.25651 3.61352 3.67708 88.97620
summary(Boston$tax)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 187.0 279.0 330.0 408.2 666.0 711.0
summary(Boston$ptratio)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 12.60 17.40 19.05 18.46 20.20 22.00
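# The summaries show a heavy right tail for crim (median 0.26 vs. max 88.98) and a wide
# range for tax (187 to 711), while ptratio spans a comparatively narrow 12.6 to 22.
# As a rough check with arbitrary, illustrative cutoffs:
sum(Boston$crim > 20)                 # tracts with very high per capita crime rates
sum(Boston$tax == max(Boston$tax))    # tracts at the maximum tax rate
range(Boston$ptratio)                 # pupil-teacher ratio range is modest by comparison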
(e) Number of tracts bounding the Charles River
sum(Boston$chas == 1)
## [1] 35
(h) Number of tracts averaging more than 7 or 8 rooms per dwelling
sum(Boston$rm > 7)
## [1] 64
sum(Boston$rm > 8)
## [1] 13
Boston[Boston$rm > 8, ]
## crim zn indus chas nox rm age dis rad tax ptratio lstat medv
## 98 0.12083 0 2.89 0 0.4450 8.069 76.0 3.4952 2 276 18.0 4.21 38.7
## 164 1.51902 0 19.58 1 0.6050 8.375 93.9 2.1620 5 403 14.7 3.32 50.0
## 205 0.02009 95 2.68 0 0.4161 8.034 31.9 5.1180 4 224 14.7 2.88 50.0
## 225 0.31533 0 6.20 0 0.5040 8.266 78.3 2.8944 8 307 17.4 4.14 44.8
## 226 0.52693 0 6.20 0 0.5040 8.725 83.0 2.8944 8 307 17.4 4.63 50.0
## 227 0.38214 0 6.20 0 0.5040 8.040 86.5 3.2157 8 307 17.4 3.13 37.6
## 233 0.57529 0 6.20 0 0.5070 8.337 73.3 3.8384 8 307 17.4 2.47 41.7
## 234 0.33147 0 6.20 0 0.5070 8.247 70.4 3.6519 8 307 17.4 3.95 48.3
## 254 0.36894 22 5.86 0 0.4310 8.259 8.4 8.9067 7 330 19.1 3.54 42.8
## 258 0.61154 20 3.97 0 0.6470 8.704 86.9 1.8010 5 264 13.0 5.12 50.0
## 263 0.52014 20 3.97 0 0.6470 8.398 91.5 2.2885 5 264 13.0 5.91 48.8
## 268 0.57834 20 3.97 0 0.5750 8.297 67.0 2.4216 5 264 13.0 7.44 50.0
## 365 3.47428 0 18.10 1 0.7180 8.780 82.9 1.9047 24 666 20.2 5.29 21.9
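# From the printed rows, tracts averaging more than 8 rooms mostly have low lstat and high
# medv (several at 50). A quick, hedged comparison against the full data set:
summary(Boston[Boston$rm > 8, c("crim", "lstat", "medv")])  # the 13 large-home tracts
summary(Boston[, c("crim", "lstat", "medv")])               # all 506 tracts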
Chapter 3: Exercise 2 — KNN Classifier vs. Regression
## KNN Classifier:
# - Used for classification problems (categorical response).
# - Assigns the most frequent class among the k nearest neighbors.
## KNN Regression:
# - Used for regression problems (continuous response).
# - Predicts the average response value of the k nearest neighbors.
# Key differences:
# - Classifier outputs class label; regression outputs a numeric value.
# - Classifier uses majority vote; regression uses averaging.
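# To make the distinction concrete, here is a minimal base-R sketch of 1-D KNN with k = 3;
# the training data are made up purely for illustration:
x_train <- c(1, 2, 3, 6, 7, 8)                # hypothetical predictor values
y_class <- c("A", "A", "A", "B", "B", "B")    # categorical response (classification)
y_num   <- c(1.0, 1.2, 0.9, 3.1, 3.0, 2.8)    # continuous response (regression)
k  <- 3
x0 <- 2.5                                     # query point
nn <- order(abs(x_train - x0))[1:k]           # indices of the k nearest neighbors
names(which.max(table(y_class[nn])))          # classifier: majority vote -> "A"
mean(y_num[nn])                               # regression: average of neighbors' responses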
Chapter 3: Exercise 10 — Carseats Regression
(a) Fit a multiple regression model
data("Carseats")
model_full <- lm(Sales ~ Price + Urban + US, data = Carseats)
summary(model_full)
##
## Call:
## lm(formula = Sales ~ Price + Urban + US, data = Carseats)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.9206 -1.6220 -0.0564 1.5786 7.0581
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.043469 0.651012 20.036 < 2e-16 ***
## Price -0.054459 0.005242 -10.389 < 2e-16 ***
## UrbanYes -0.021916 0.271650 -0.081 0.936
## USYes 1.200573 0.259042 4.635 4.86e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.472 on 396 degrees of freedom
## Multiple R-squared: 0.2393, Adjusted R-squared: 0.2335
## F-statistic: 41.52 on 3 and 396 DF, p-value: < 2.2e-16
(b) Interpretation of coefficients
# Price: holding Urban and US fixed, a $1 increase in price is associated with a decrease of
#   about 0.054 in Sales (roughly 54 fewer units sold, since Sales is measured in thousands).
# UrbanYes: urban stores sell about 0.022 (thousand) fewer units than non-urban stores, but
#   this effect is not statistically significant (p = 0.936).
# USYes: US stores sell about 1.20 thousand more units than non-US stores, holding the other
#   predictors fixed; this effect is highly significant.
(c) Regression equation
# Sales = β0 + β1 * Price + β2 * UrbanYes + β3 * USYes + ε
# With the estimates from summary(model_full), where UrbanYes and USYes are 0/1 indicators:
# Sales_hat ≈ 13.043 - 0.054 * Price - 0.022 * UrbanYes + 1.201 * USYes
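# As a hedged illustration, predict() evaluates this fitted equation for a hypothetical
# new store (the values below are made up):
new_store <- data.frame(Price = 120, Urban = "Yes", US = "Yes")
predict(model_full, newdata = new_store)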
(d) Hypothesis tests
# H0: βj = 0 for each predictor. From the p-values in summary(model_full):
# reject H0 for Price (p < 2e-16) and US (p = 4.86e-06); fail to reject H0 for Urban (p = 0.936).
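# The p-values can also be pulled directly from the coefficient table; a small sketch:
summary(model_full)$coefficients[, "Pr(>|t|)"]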
(e) Fit reduced model
model_reduced <- lm(Sales ~ Price + US, data = Carseats)
summary(model_reduced)
##
## Call:
## lm(formula = Sales ~ Price + US, data = Carseats)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.9269 -1.6286 -0.0574 1.5766 7.0515
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.03079 0.63098 20.652 < 2e-16 ***
## Price -0.05448 0.00523 -10.416 < 2e-16 ***
## USYes 1.19964 0.25846 4.641 4.71e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.469 on 397 degrees of freedom
## Multiple R-squared: 0.2393, Adjusted R-squared: 0.2354
## F-statistic: 62.43 on 2 and 397 DF, p-value: < 2.2e-16
(f) Compare model fits
summary(model_full)$adj.r.squared
## [1] 0.2335123
summary(model_reduced)$adj.r.squared
## [1] 0.2354305
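# The reduced model has a marginally higher adjusted R-squared and a slightly smaller residual
# standard error (2.469 vs. 2.472), so dropping Urban costs nothing. A partial F-test is another
# way to compare the nested fits; a minimal sketch:
anova(model_reduced, model_full)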
(g) 95% Confidence intervals
confint(model_reduced)
## 2.5 % 97.5 %
## (Intercept) 11.79032020 14.27126531
## Price -0.06475984 -0.04419543
## USYes 0.69151957 1.70776632
(h) Outliers and leverage
par(mfrow = c(2, 2))
plot(model_reduced)
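# Beyond the diagnostic plots, a rough numeric check (thresholds are conventional rules of thumb):
# studentized residuals beyond about +/-3 suggest outliers, and hat values well above the
# average (p + 1)/n suggest high leverage.
which(abs(rstudent(model_reduced)) > 3)                  # potential outliers
avg_hat <- length(coef(model_reduced)) / nrow(Carseats)  # average leverage (p + 1)/n
which(hatvalues(model_reduced) > 3 * avg_hat)            # high-leverage observations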

Chapter 4: Exercise 12 — Logistic Regression vs. Softmax
(a) Log odds in your model
# log(p_orange / (1 - p_orange)) = β0 + β1 * x
(b) Log odds in your friend’s model
# log(p_orange / p_apple) = (α_orange0 - α_apple0) + (α_orange1 - α_apple1) * x
(c) Match coefficients: β0 = 2, β1 = -1
# β0 = α_orange0 - α_apple0 = 2
# β1 = α_orange1 - α_apple1 = -1
# The individual softmax coefficients are not uniquely determined; only these differences are.
# One valid choice: α_orange0 = 2, α_orange1 = -1, α_apple0 = 0, α_apple1 = 0
(d) Friend’s softmax model estimates
# α_orange0 = 1.2, α_orange1 = -2
# α_apple0 = 3, α_apple1 = 0.6
# Then:
# β0 = 1.2 - 3 = -1.8
# β1 = -2 - 0.6 = -2.6
(e) Predictions comparison
# Within each data set, the logistic and softmax fits are reparameterizations of the same model:
# β0 + β1 * x = (α_orange0 - α_apple0) + (α_orange1 - α_apple1) * x,
# so both produce the same log-odds and the same decision boundary.
# Therefore the predicted class labels are expected to agree 100% of the time on the test set.
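# A small sketch verifying the equivalence numerically; the x values are simulated for illustration:
b0 <- -1.8; b1 <- -2.6                          # your logistic coefficients from (d)
ao0 <- 1.2; ao1 <- -2; aa0 <- 3; aa1 <- 0.6     # friend's softmax coefficients from (d)
x <- seq(-3, 3, by = 0.1)                       # illustrative grid of test points
p_logit <- 1 / (1 + exp(-(b0 + b1 * x)))        # P(orange) from the logistic model
s_or <- exp(ao0 + ao1 * x)                      # softmax score for orange
s_ap <- exp(aa0 + aa1 * x)                      # softmax score for apple
p_soft <- s_or / (s_or + s_ap)                  # P(orange) from the softmax model
all((p_logit > 0.5) == (p_soft > 0.5))          # should be TRUE: identical predicted labels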