Proposed Model: Lasso Regression (ridge would be the alternative if its test MSE turns out lower)
Based on our analysis, we recommend using Lasso regression to model the per capita crime rate (crim) in the Boston dataset. This decision is supported by the following:
Validation-Based Performance
We used a train/test split and evaluated each model by its test mean squared error (MSE):
Lasso produced one of the lowest test MSE values
This suggests it generalizes well to unseen data
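The hold-out procedure described above can be sketched in a few lines. This is a minimal illustration, not the analysis itself: the data are synthetic (not the Boston data), a one-predictor least-squares fit stands in for the lasso, ridge, and PCR models, and the helper names `train_test_split`, `mse`, and `fit_simple_ols` are hypothetical.

```python
import random

def train_test_split(rows, test_fraction=0.5, seed=1):
    """Shuffle a copy of the rows, then split into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def fit_simple_ols(xs, ys):
    """Least-squares (intercept, slope) for a single predictor."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Synthetic stand-in data (NOT the Boston data): y = 2 + 3x + noise.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    data.append((x, 2.0 + 3.0 * x + rng.gauss(0, 1)))

train, test = train_test_split(data)
train_x, train_y = zip(*train)
test_x, test_y = zip(*test)

# Fit on the training half only, then score on the untouched test half.
intercept, slope = fit_simple_ols(train_x, train_y)
model_mse = mse(test_y, [intercept + slope * x for x in test_x])

# Baseline that ignores the predictor: always predict the training mean.
train_mean = sum(train_y) / len(train_y)
baseline_mse = mse(test_y, [train_mean] * len(test_y))

print(f"test MSE, fitted model:  {model_mse:.3f}")
print(f"test MSE, mean baseline: {baseline_mse:.3f}")
```

The point of the sketch is only that every model is scored on observations it never saw during fitting, so a low test MSE is evidence of generalization rather than of a close fit to the training data.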
Model Simplicity and Interpretability
Lasso eliminated unnecessary predictors by shrinking some coefficients to zero, yielding a sparse and interpretable model. This is especially helpful when we want to understand which variables drive crime rates.
Comparison with Other Methods
While ridge regression and PCR had comparable test MSEs, they:
Retained all predictors (ridge), or
Used transformed variables (PCR), which are less interpretable
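Why lasso drops predictors while ridge keeps them all can be seen in the textbook special case of orthonormal predictors, where both estimators have closed forms: lasso soft-thresholds each least-squares coefficient at λ/2, and ridge divides it by 1 + λ. The sketch below assumes that special case; the coefficient values and the names `x1` to `x4` are made up for illustration and are not results from the Boston analysis.

```python
def soft_threshold(b, t):
    """Lasso-style shrinkage: move b toward zero by t, stopping at exactly zero."""
    if b > t:
        return b - t
    if b < -t:
        return b + t
    return 0.0

def ridge_shrink(b, lam):
    """Ridge-style shrinkage: scale b toward zero; it stays nonzero if b is."""
    return b / (1.0 + lam)

# Hypothetical least-squares coefficients, chosen only for illustration.
ols = {"x1": 2.5, "x2": -0.3, "x3": 0.1, "x4": -1.2}
lam = 1.0

lasso = {name: soft_threshold(b, lam / 2) for name, b in ols.items()}
ridge = {name: ridge_shrink(b, lam) for name, b in ols.items()}

kept_by_lasso = [name for name, b in lasso.items() if b != 0.0]
kept_by_ridge = [name for name, b in ridge.items() if b != 0.0]

print("lasso keeps:", kept_by_lasso)
print("ridge keeps:", kept_by_ridge)
```

In this setting the two small coefficients fall inside the threshold and are set exactly to zero by the lasso rule, while the ridge rule only rescales them, which is the interpretability difference described above.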
Cross-Validation Support
The optimal penalty parameter (λ) for lasso was selected by cross-validation, so the penalty was tuned on held-out folds rather than on the observations used to fit each candidate model.
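A rough sketch of selecting λ by k-fold cross-validation follows. It is not the procedure actually run for this analysis: the data are synthetic, the λ grid is arbitrary, and the lasso is fitted with a small hand-written coordinate-descent routine (`fit_lasso`, a hypothetical helper) so the example needs only the standard library. In practice a library routine such as scikit-learn's `LassoCV` would normally handle this.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t, returning exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_lasso(X, y, lam, n_sweeps=100):
    """Coordinate descent for (1/2n) * RSS + lam * sum(|beta_j|).

    Predictors and response are centered so the intercept is unpenalized.
    Returns (intercept, beta).
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    resid = [yi - y_mean for yi in y]  # residuals when beta is all zeros
    col_sq = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Covariance of column j with the residual that excludes x_j's own term.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta

def predict(intercept, beta, X):
    return [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]

def cv_mse(X, y, lam, k=5):
    """Average held-out MSE over k folds (rows are assumed to be in random order)."""
    n = len(X)
    fold_errors = []
    for fold in range(k):
        held = [i for i in range(n) if i % k == fold]
        kept = [i for i in range(n) if i % k != fold]
        b0, beta = fit_lasso([X[i] for i in kept], [y[i] for i in kept], lam)
        preds = predict(b0, beta, [X[i] for i in held])
        fold_errors.append(sum((y[i] - p) ** 2 for i, p in zip(held, preds)) / len(held))
    return sum(fold_errors) / k

# Synthetic stand-in data (NOT the Boston data): only predictors 0 and 3 matter.
rng = random.Random(0)
true_beta = [3.0, 0.0, 0.0, -2.0, 0.0, 0.0]
X = [[rng.gauss(0, 1) for _ in true_beta] for _ in range(120)]
y = [1.0 + sum(b * x for b, x in zip(true_beta, row)) + rng.gauss(0, 1) for row in X]

lambda_grid = [0.01, 0.1, 1.0, 10.0]  # arbitrary illustrative grid
cv_scores = {lam: cv_mse(X, y, lam) for lam in lambda_grid}
best_lam = min(cv_scores, key=cv_scores.get)

# Refit on all the data at the selected penalty.
intercept, beta = fit_lasso(X, y, best_lam)
print("CV MSE by lambda:", {lam: round(s, 3) for lam, s in cv_scores.items()})
print("selected lambda:", best_lam)
print("coefficients:", [round(b, 3) for b in beta])
```

Each candidate λ is judged only on folds that were left out of its fit, which is what keeps the choice of penalty from being tuned to the fitting data.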
Conclusion
Lasso regression strikes the best balance between prediction accuracy, interpretability, and robustness to overfitting, based on its strong performance on the test set and the sparsity of the resulting model. It is therefore the preferred choice for modeling crime rate in this dataset.