Study Guide: Understanding Overfitting and Regularization in Machine Learning (integrating theory, mathematical foundations, and practical examples of regularization techniques to show their impact on overfitting)


1. Overfitting

  • Definition: Overfitting occurs when a model is overly tailored to the training data, leading to poor performance on unseen data.
  • Symptoms:
    • High accuracy on training data but poor performance on validation/test data.
    • The model captures noise and idiosyncratic patterns that are specific to the training data and do not generalize.
  • Key Issue: Overfitted models have low bias but high variance.
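
The symptoms above can be reproduced on a small synthetic problem. The sketch below is illustrative only: the noisy sine data, the sample sizes, and the polynomial degrees 3 and 15 are assumptions chosen for the demonstration, not part of the lecture. It fits both polynomials and reports training and test error; the higher-degree fit typically shows a much lower training error than test error.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Hypothetical data: a noisy sine curve with 20 training and 200 test points.
X_train = rng.uniform(0, 1, size=(20, 1))
y_train = np.sin(2 * np.pi * X_train[:, 0]) + rng.normal(0, 0.3, size=20)
X_test = rng.uniform(0, 1, size=(200, 1))
y_test = np.sin(2 * np.pi * X_test[:, 0]) + rng.normal(0, 0.3, size=200)

results = {}
for degree in (3, 15):
    # Unregularized polynomial regression of the given degree.
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results[degree] = (train_mse, test_mse)
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

A large gap between the two errors for the flexible model is the low-bias, high-variance pattern described above.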

2. Tools to Combat Overfitting

  • Validation Set: Monitor performance on held-out data during training to detect when training and validation error start to diverge.
  • Regularization: Introduce penalties to the loss function.

3. Regularization: Core Concept

  • Definition: Adding a penalty term to the loss function to discourage complex models.
  • Mathematical Representation: \[ \text{Loss Function} = \underbrace{\sum (y_i - \hat{y}_i)^2}_{\text{Squared Loss}} + \underbrace{\lambda \cdot \text{Penalty}}_{\text{Regularization Term}} \]
    • \(\lambda\): Regularization strength (hyperparameter).
    • \(\text{Penalty}\): Function of model parameters to constrain them.
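
As a rough illustration of the formula above, the regularized loss of a linear model can be computed directly. The function names `regularized_loss` and `sum_of_squares`, the toy data, and the absence of an intercept are assumptions made for this sketch; any function of the weights could be passed as the penalty.

```python
import numpy as np

def regularized_loss(X, y, w, lam, penalty_fn):
    """Squared loss plus lam * penalty for a linear model y_hat = X @ w."""
    residuals = y - X @ w
    squared_loss = np.sum(residuals ** 2)
    return squared_loss + lam * penalty_fn(w)

def sum_of_squares(w):
    """One possible penalty: the sum of squared weights."""
    return np.sum(w ** 2)

# Toy data (hypothetical values, chosen so the arithmetic is easy to check).
X = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([1.0, 2.0])
w = np.array([0.5, -0.5])

unpenalized = regularized_loss(X, y, w, lam=0.0, penalty_fn=sum_of_squares)
penalized = regularized_loss(X, y, w, lam=2.0, penalty_fn=sum_of_squares)
print(unpenalized, penalized)
```

Setting `lam=0.0` recovers the plain squared loss, and larger values of `lam` make large weights increasingly costly.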

4. Types of Regularization

  • L1 Regularization (LASSO):

    • Penalty: \(\lambda \sum |w_i|\) (absolute values of coefficients).
    • Use Case: Feature selection (some coefficients are driven to 0, creating sparse models).
    • Interpretation: Encourages sparsity by setting weak feature coefficients to 0.

    \[ \text{Loss} = \sum (y_i - \hat{y}_i)^2 + \lambda \sum |w_i| \]

  • L2 Regularization (Ridge):

    • Penalty: \(\lambda \sum w_i^2\) (squared values of coefficients).
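    • Use Case: General-purpose overfitting prevention when sparsity is not required.
    • Interpretation: Penalizes large coefficients more strongly than small ones; coefficients shrink toward 0 but generally do not reach exactly 0, so all features stay in the model.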

    \[ \text{Loss} = \sum (y_i - \hat{y}_i)^2 + \lambda \sum w_i^2 \]
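
To see why the two penalties behave differently, it can help to look at a single coefficient in isolation. The sketch below assumes a simplified one-dimensional problem, minimizing \((z - w)^2 + \lambda \cdot \text{Penalty}(w)\) for a given unpenalized estimate \(z\); this simplification and the function names are assumptions for illustration, not part of the lecture. In this setting the L1 minimizer subtracts a fixed amount and clips at 0, while the L2 minimizer rescales by \(1/(1+\lambda)\).

```python
import numpy as np

def l1_minimizer(z, lam):
    """Minimizer of (z - w)**2 + lam * |w|: soft-thresholding at lam / 2."""
    return np.sign(z) * np.maximum(np.abs(z) - lam / 2.0, 0.0)

def l2_minimizer(z, lam):
    """Minimizer of (z - w)**2 + lam * w**2: proportional shrinkage."""
    return z / (1.0 + lam)

# Hypothetical unpenalized estimates: two strong and two weak coefficients.
z = np.array([3.0, 0.4, -0.2, -2.0])
lam = 1.0

w_l1 = l1_minimizer(z, lam)  # weak coefficients become exactly 0
w_l2 = l2_minimizer(z, lam)  # every coefficient shrinks, none reaches 0
print("L1:", w_l1)
print("L2:", w_l2)
```

Coefficients whose magnitude is below \(\lambda/2\) are set exactly to 0 under L1, which is the sparsity described above; under L2 every coefficient is scaled down by the same factor.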


5. Comparison: L1 vs. L2

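| Aspect              | L1 (LASSO)                           | L2 (Ridge)                             |
|---------------------|--------------------------------------|----------------------------------------|
| Penalty Function    | Absolute (\(\lvert w_i \rvert\))     | Squared (\(w_i^2\))                    |
| Behavior            | Sparse coefficients (some exactly 0) | Shrinks all coefficients, none to 0    |
| Use Case            | Feature selection                    | General overfitting prevention         |
| Geometry of Penalty | Diamond-shaped constraint region     | Circular constraint region             |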
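
The contrast in the table can be checked empirically. The following sketch assumes synthetic data in which only the first three of ten features influence the target; the data, the `alpha` values, and the variable names are illustrative assumptions. Note that scikit-learn's `Lasso` divides the squared loss by twice the number of samples, so its `alpha` is not numerically identical to \(\lambda\) in the formulas above.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: 10 features, only the first 3 carry signal.
n_samples, n_features = 100, 10
X = rng.normal(size=(n_samples, n_features))
true_w = np.zeros(n_features)
true_w[:3] = [3.0, -2.0, 1.5]
y = X @ true_w + rng.normal(0, 0.5, size=n_samples)

X_scaled = StandardScaler().fit_transform(X)

lasso = Lasso(alpha=0.1).fit(X_scaled, y)
ridge = Ridge(alpha=1.0).fit(X_scaled, y)

# Count coefficients that are exactly zero under each penalty.
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))
print("Lasso zero coefficients:", n_zero_lasso)
print("Ridge zero coefficients:", n_zero_ridge)
```

On data like this, LASSO usually zeroes out several of the uninformative features, while Ridge keeps all ten with small but non-zero weights.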

6. Practical Implementation: Python Examples

  • Data Preprocessing:

    from sklearn.preprocessing import StandardScaler
    scaler = StandardScaler()
    # X is the feature matrix; standardize each feature to zero mean and unit variance
    X_scaled = scaler.fit_transform(X)
  • L1 Regularization (LASSO):

    from sklearn.linear_model import Lasso
    lasso = Lasso(alpha=0.1)  # Set regularization strength
    lasso.fit(X_scaled, y)
    print("Coefficients:", lasso.coef_)
  • L2 Regularization (Ridge):

    from sklearn.linear_model import Ridge
    ridge = Ridge(alpha=1.0)  # Set regularization strength
    ridge.fit(X_scaled, y)
    print("Coefficients:", ridge.coef_)
  • Tuning Regularization Strength (\(\lambda\)):

    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score
    alphas = [0.01, 0.1, 1, 10, 100]
    for alpha in alphas:
        model = Ridge(alpha=alpha)
        # scikit-learn reports negated MSE so that a higher score is always better
        scores = cross_val_score(model, X_scaled, y, cv=5, scoring='neg_mean_squared_error')
        print(f"Alpha: {alpha}, CV MSE: {-scores.mean():.4f}")
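
One caveat with scaling the full dataset before cross-validation is that each validation fold has already influenced the scaler. A possible alternative, sketched below under the assumption of synthetic data with mixed feature scales, wraps the scaler and the model in a scikit-learn `Pipeline` and lets `GridSearchCV` search over the same alpha grid, so the scaler is refit on the training portion of each fold. The data and variable names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: 5 features on very different scales.
X = rng.normal(size=(120, 5)) * np.array([1, 10, 100, 1, 1])
y = X @ np.array([1.0, 0.2, 0.03, 0.0, 0.0]) + rng.normal(0, 0.5, size=120)

# make_pipeline names each step after its lowercased class name ("ridge").
pipeline = make_pipeline(StandardScaler(), Ridge())
param_grid = {"ridge__alpha": [0.01, 0.1, 1, 10, 100]}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)

best_alpha = search.best_params_["ridge__alpha"]
best_mse = -search.best_score_
print(f"Best alpha: {best_alpha}, CV MSE: {best_mse:.4f}")
```

After fitting, `search.best_estimator_` holds the pipeline refit on all of the data with the selected alpha.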

7. Bias-Variance Tradeoff

  • High Bias (Underfitting): Model is too simple, failing to capture data patterns.
  • High Variance (Overfitting): Model is too complex, capturing noise in the data.
  • Regularization: Trades a small increase in bias for a reduction in variance; a well-chosen \(\lambda\) balances the two to achieve a generalizable model.
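
The effect of the penalty on model flexibility can be observed by refitting Ridge with increasing strength. The sketch below uses assumed synthetic data (40 samples, 15 features, 2 of them informative) and an arbitrary alpha grid; it records the coefficient norm and the training error for each alpha. As alpha grows, the coefficients shrink and the fit to the training data loosens, which corresponds to moving from the high-variance end toward the high-bias end.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Hypothetical data: few samples relative to the number of features.
X = rng.normal(size=(40, 15))
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 1.0, size=40)

alphas = [0.01, 1, 100, 10000]
coef_norms, train_mses = [], []
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X, y)
    coef_norms.append(np.linalg.norm(model.coef_))
    train_mses.append(mean_squared_error(y, model.predict(X)))
    print(f"alpha={alpha:>7}  ||w||={coef_norms[-1]:.3f}  train MSE={train_mses[-1]:.3f}")
```

Training error alone always favors the smallest alpha, which is why the choice of \(\lambda\) is made on validation data rather than training data.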

8. Key Takeaways

  1. Regularization Strength (\(\lambda\)):
    • Controls the impact of the penalty.
    • Needs to be tuned experimentally.
  2. Interpret Coefficients:
    • Scale features before fitting so the penalty treats all of them equally and coefficient magnitudes are comparable.
  3. Validation:
    • Use cross-validation to determine optimal \(\lambda\).
  4. L1 vs. L2:
    • Both reduce overfitting: L1 when feature selection (sparsity) is wanted, L2 when all features should be kept and shrunk.

9. Visualization and Tools

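  • Visualize cross-validation loss against regularization strength:

    import matplotlib.pyplot as plt
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score
    # Mean CV loss per alpha; X_scaled, y, and alphas come from Section 6
    cv_losses = [-cross_val_score(Ridge(alpha=a), X_scaled, y, cv=5,
                                  scoring='neg_mean_squared_error').mean() for a in alphas]
    plt.plot(alphas, cv_losses, marker='o')
    plt.xscale('log')  # alphas span several orders of magnitude
    plt.xlabel('Alpha (λ)')
    plt.ylabel('Cross-Validation Loss')
    plt.title('Tuning Regularization Strength')
    plt.show()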

Questions from the Lecture

  1. Conceptual Question:
    What is the key difference between L1 (LASSO) and L2 (Ridge) regularization, and how does it affect model coefficients?

  2. Analytical Question:
    How does increasing the regularization strength (\(\lambda\)) affect the bias-variance tradeoff in a machine learning model?

  3. Practical Question:
    Why is it important to scale features before applying regularization, and how does it impact the interpretation of model coefficients?


Takeaways from the Lecture

  1. Understanding Overfitting:
    Overfitting occurs when a model performs exceptionally well on training data but poorly on unseen data. Regularization is a critical tool to combat overfitting by constraining model complexity.

  2. Role of Regularization Techniques:

    • L1 (LASSO) regularization introduces sparsity by driving weak feature coefficients to zero, making it ideal for feature selection.
    • L2 (Ridge) regularization uniformly shrinks coefficients, preventing overfitting without eliminating features.
  3. Importance of Hyperparameter Tuning:
    The regularization strength (\(\lambda\)) must be carefully tuned (e.g., using cross-validation) to balance model bias and variance, ensuring optimal performance on unseen data.
