Error Metrics Cheatsheet

Author

D.McCabe

Estimator Prediction Error

Sample variance is at the heart of understanding estimator prediction error.

…variability in our estimates is what makes them imprecise. An important aspect of statistics is quantifying the variability in our estimates.

      - Brian Caffo, JHU Data Science Specialization/Statistical Inference/Module 2/Variability

A point estimate alone, without a measure of its precision, is essentially meaningless. Without variability across samples, we would have no way to detect or define imprecision. The very process of measuring precision fundamentally depends on variability. Rather than being an enemy of precision, the standard error is what enables us to understand and quantify it.

We can typically reduce the standard error of a point estimator by increasing the sample size, as shown in this short study.
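
As a rough illustration (a minimal NumPy sketch with an arbitrary \(\sigma = 2\) and simulated normal data, not the study itself), the spread of the sample mean across repeated experiments shrinks like \(\sigma/\sqrt{n}\):

```python
import numpy as np

rng = np.random.default_rng(42)
sigma = 2.0  # assumed population standard deviation for the simulation

for n in (10, 100, 1_000):
    # 5,000 repeated experiments, each drawing a sample of size n and taking its mean
    means = rng.normal(loc=0.0, scale=sigma, size=(5_000, n)).mean(axis=1)
    empirical_se = means.std(ddof=1)        # spread of the estimator across experiments
    theoretical_se = sigma / np.sqrt(n)     # SE = sigma / sqrt(n)
    print(f"n={n:>5}  empirical SE={empirical_se:.4f}  theoretical SE={theoretical_se:.4f}")
```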

| Name | Description | Definition/Formula |
| --- | --- | --- |
| MSE: Mean Squared Error | Avg. squared error between actual and prediction. Used in estimation (frequentist analysis) and regression | \(\mathit{MSE}=\frac{1}{n} \sum_i (y_i-\hat y_i)^2 = \mathbb E\!\big[(\widehat{\theta}-\theta)^2\big]\); decomposition: \(\mathit{MSE}(\widehat{\theta}) = \underbrace{\operatorname{Var}(\widehat{\theta})}_{\mathbb E[(\widehat{\theta}-\mathbb E[\widehat{\theta}])^2]} + \underbrace{\big(\mathbb E[\widehat{\theta}] - \theta\big)^2}_{\operatorname{Bias}(\widehat{\theta})^2}\) |
| RMSE: Root Mean Squared Error | Square root of MSE; error expressed in the same units as the target | \(\mathit{RMSE}=\sqrt{\mathit{MSE}}=\sqrt{\frac{1}{n} \sum_i (y_i-\hat y_i)^2}\) |
| MAE: Mean Absolute Error | Robust measure (less influenced by outliers); penalises big errors less heavily than \(\mathit{RMSE}\) | \(\mathit{MAE}=\frac{1}{n} \sum_i \lvert y_i-\hat y_i\rvert\) |
| Variance | Spread of data around its mean | \(\operatorname{Var}(X)=\frac{1}{n-1} \sum_i (x_i-\bar{x})^2\) (sample variance, \(n-1\) degrees of freedom) |
| Standard Deviation (SD) | Spread of data in original units | \(\sigma=\sqrt{\operatorname{Var}(X)}\) |
| Bias | Systematic error: \(\operatorname{Bias}(\widehat{\theta})=\mathbb E[\widehat{\theta}]-\theta\), estimated in prediction by the mean error | \(\mathit{Bias}=\frac{1}{n} \sum_i (y_i-\hat y_i)\) |
| SE: Standard Error (doesn't consider bias) | Variability of an estimated statistic \(\hat{\theta}\) under repeated sampling | \(\mathit{SE}=\frac{\sigma}{\sqrt{n}}\) |
| Metric | What It Measures | Formula | Use Case |
| --- | --- | --- | --- |
| \(R^2\) | Proportion of target variance explained by the model | \(1-\frac{\sum_i (y_i-\hat y_i)^2}{\sum_i (y_i-\bar y)^2}\) | Evaluating model fit |
| Coefficient of Variation (CV) | Ratio of SD to the mean | \(\frac{\sigma}{\bar{x}}\) | Compare variability across units or scales |
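
For concreteness, a small NumPy sketch that computes the metrics above for a pair of toy arrays (the helper name error_metrics and the example values are illustrative, not part of the original cheatsheet):

```python
import numpy as np

def error_metrics(y, y_hat):
    """Error metrics from the tables above, for actuals y and predictions y_hat."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    err = y - y_hat
    mse = np.mean(err**2)
    return {
        "MSE":  mse,
        "RMSE": np.sqrt(mse),
        "MAE":  np.mean(np.abs(err)),
        "Bias": np.mean(err),                                      # mean error
        "R2":   1.0 - np.sum(err**2) / np.sum((y - np.mean(y))**2),
        "CV":   np.std(y, ddof=1) / np.mean(y),                    # coefficient of variation of y
    }

y     = [3.0, 5.0, 7.5, 9.0, 11.0]
y_hat = [2.8, 5.4, 7.0, 9.3, 10.5]
print(error_metrics(y, y_hat))
```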

Common Statistical Relationships

Here are key relationships between common statistical metrics:

  1. Bias–Variance Decomposition for MSE (checked numerically in the sketch after this list)
    \[ \mathbb E[(y-\hat y)^2] = \big(\mathbb E[\hat y] - y\big)^2 + \operatorname{Var}(\hat y) + \sigma^2, \]
    where \(\sigma^2\) is the irreducible noise variance.

  2. Root Mean Squared Error
    \[ RMSE = \sqrt{MSE} \]

  3. Standard Deviation and Variance
    \[ SD = \sqrt{Var(X)} \]

  4. Standard Error of the Mean
    \[ SE(\bar{x}) = \frac{SD}{\sqrt{n}} \]

  5. R-squared and Mean Squared Error
    \[ R^2 = 1 - \frac{MSE}{Var(Y)} \]

  6. Coefficient of Variation
    \[ CV = \frac{SD}{\bar{x}} \]
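
A quick numerical sanity check of relationship 1, in its estimator form \(\mathit{MSE}=\operatorname{Var}+\operatorname{Bias}^2\) (no irreducible-noise term when estimating a parameter); the shrunken estimator \(0.9\,\bar{x}\) and all numbers below are purely illustrative:

```python
import numpy as np

# Deliberately biased estimator of the mean: theta_hat = 0.9 * sample mean,
# estimating theta = 5 from n = 20 normal draws, repeated many times.
rng = np.random.default_rng(0)
theta, n, reps = 5.0, 20, 200_000

theta_hat = 0.9 * rng.normal(loc=theta, scale=2.0, size=(reps, n)).mean(axis=1)

mse      = np.mean((theta_hat - theta) ** 2)
variance = theta_hat.var()
bias_sq  = (theta_hat.mean() - theta) ** 2

print(f"MSE          = {mse:.5f}")
print(f"Var + Bias^2 = {variance + bias_sq:.5f}")   # agrees with MSE up to simulation noise
```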


Summary

  • MSE captures error (and can be broken down into bias and variance).
  • SD, Variance, SE, and CV measure spread and precision.
  • R² relates prediction error (MSE) to total variability.

Key ROC Metrics

| Metric | Formula | Description |
| --- | --- | --- |
| True Positive Rate (TPR) / Sensitivity | \(TPR = \frac{TP}{TP + FN}\) | Proportion of actual positives correctly identified |
| False Positive Rate (FPR) | \(FPR = \frac{FP}{FP + TN}\) | Proportion of actual negatives incorrectly identified |
| True Negative Rate (TNR) / Specificity | \(TNR = \frac{TN}{TN + FP}\) | Proportion of actual negatives correctly identified |
| False Negative Rate (FNR) | \(FNR = \frac{FN}{FN + TP}\) | Proportion of actual positives missed |
| ROC Curve | Plot of TPR vs FPR | Visualizes classifier performance at all thresholds |
| AUC (Area Under Curve) | Value between 0 and 1 | Overall ability of the model to discriminate classes |
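
If scikit-learn is available, the ROC quantities above can be computed directly; the labels and scores below are toy values:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])                      # actual classes
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.55, 0.7])    # predicted scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # TPR vs FPR at each threshold
auc = roc_auc_score(y_true, y_score)

for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
print(f"AUC = {auc:.3f}")
```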

Binary Classification Metrics (Beyond ROC)

| Metric | Formula | Use |
| --- | --- | --- |
| Accuracy | \(\frac{TP + TN}{TP + TN + FP + FN}\) | Overall correctness |
| Precision | \(\frac{TP}{TP + FP}\) | Positive predictive value |
| Recall (TPR) | \(\frac{TP}{TP + FN}\) | Sensitivity |
| F1 Score | \(2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}\) | Balance between precision & recall |
| Log Loss | \(-\frac{1}{n}\sum_i \big[y_i \log(\hat y_i) + (1 - y_i)\log(1 - \hat y_i)\big]\) | Penalizes false confidence |
| Matthews Correlation Coefficient (MCC) | \(\frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}\) | Balanced performance summary |
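
The threshold-based metrics above fall straight out of confusion-matrix counts; a minimal sketch with made-up counts (log loss is omitted because it needs predicted probabilities rather than counts):

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Binary classification metrics from the table above, from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)
    mcc_den   = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy":  (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall":    recall,
        "f1":        2 * precision * recall / (precision + recall),
        "mcc":       (tp * tn - fp * fn) / mcc_den,
    }

print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```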

Regression Metrics

| Metric | Formula | Use |
| --- | --- | --- |
| MSE / RMSE | Already covered | Overall prediction error |
| MAE | Already covered | Robust error |
| R² / Adjusted R² | \(R^2 = 1 - \frac{\sum_i (y_i-\hat y_i)^2}{\sum_i (y_i-\bar y)^2}\) | Fit quality |
| MAPE (Mean Absolute Percentage Error) | \(\frac{100\%}{n}\sum_i \left\lvert \frac{y_i-\hat y_i}{y_i} \right\rvert\) | Relative error (%) |
| Huber Loss | Piecewise quadratic/linear loss | Robust to outliers |
| Explained Variance | \(1 - \frac{\operatorname{Var}(y-\hat y)}{\operatorname{Var}(y)}\) | Proportion of variance explained |
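
MAPE and the Huber loss are the less standard entries here, so a brief sketch may help; \(\delta = 1.0\) and the toy values are arbitrary choices:

```python
import numpy as np

def huber_loss(y, y_hat, delta=1.0):
    """Huber loss: quadratic for residuals within +/- delta, linear beyond it."""
    r = np.abs(np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float))
    quadratic = 0.5 * r**2
    linear    = delta * (r - 0.5 * delta)
    return np.where(r <= delta, quadratic, linear).mean()

def mape(y, y_hat):
    """Mean absolute percentage error (undefined if any y_i is zero)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return 100.0 * np.mean(np.abs((y - y_hat) / y))

y, y_hat = [3.0, 5.0, 7.5, 9.0], [2.8, 5.4, 7.0, 9.6]
print(f"Huber = {huber_loss(y, y_hat):.4f}, MAPE = {mape(y, y_hat):.2f}%")
```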

Model Selection / Complexity Metrics

| Metric | Formula | Use |
| --- | --- | --- |
| AIC | See Model Selection Criteria below | Fit + complexity |
| BIC | See Model Selection Criteria below | Stronger complexity penalty |
| Adjusted R² | \(1 - (1 - R^2)\frac{n - 1}{n - p - 1}\) | Adjusts R² for the number of predictors |
| Cross-Validation Error | Average out-of-fold error | Generalization estimate |
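
Cross-validation error is the one row here that is computed rather than given in closed form; a minimal scikit-learn sketch on synthetic data (LinearRegression and 5 folds are arbitrary choices):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic regression data -- illustrative only.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.5, size=200)

# 5-fold cross-validated MSE (scikit-learn returns it as a negated score to maximise).
scores = cross_val_score(LinearRegression(), X, y,
                         scoring="neg_mean_squared_error", cv=5)
print("CV MSE per fold:", np.round(-scores, 4))
print(f"Mean CV MSE: {-scores.mean():.4f}")
```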

Probability & Calibration Metrics

| Metric | Formula | Use |
| --- | --- | --- |
| Brier Score | \(\frac{1}{n}\sum_i (y_i - p_i)^2\) | Probabilistic accuracy |
| Calibration Curve | Plot of predicted probability vs. observed frequency | Checks how predicted probabilities align with reality |
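
Both entries are straightforward to compute, either by hand or with scikit-learn's calibration_curve helper; the labels and probabilities below are made up:

```python
import numpy as np
from sklearn.calibration import calibration_curve

y_true = np.array([0, 1, 1, 0, 1, 1, 0, 0, 1, 0])                        # actual outcomes
p_hat  = np.array([0.2, 0.9, 0.7, 0.3, 0.6, 0.8, 0.1, 0.4, 0.65, 0.25])  # predicted probabilities

brier = np.mean((y_true - p_hat) ** 2)                 # Brier score: lower is better
frac_pos, mean_pred = calibration_curve(y_true, p_hat, n_bins=3)

print(f"Brier score: {brier:.4f}")
print("Calibration curve (mean predicted probability vs. observed frequency per bin):")
for mp, fp in zip(mean_pred, frac_pos):
    print(f"  predicted={mp:.2f}  observed={fp:.2f}")
```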

Model Selection Criteria

| Metric | Formula | Description |
| --- | --- | --- |
| AIC | \(AIC = 2k - 2 \ln(\mathcal{L})\) | Akaike Information Criterion: balances fit and model complexity |
| AICc | \(AIC_c = AIC + \frac{2k(k + 1)}{n - k - 1}\) | Corrected AIC: adjusts AIC for small sample sizes |
| BIC | \(BIC = \ln(n) \cdot k - 2 \ln(\mathcal{L})\) | Bayesian Information Criterion: stronger penalty for model complexity |
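
To make the formulas concrete, a small sketch computing all three criteria from a maximised log-likelihood; the RSS-based likelihood below assumes Gaussian errors, the numbers are illustrative, and conventions differ on whether \(k\) counts the error variance:

```python
import numpy as np

def aic_bic(log_likelihood, k, n):
    """AIC, AICc and BIC from a maximised log-likelihood, k parameters, n observations."""
    aic  = 2 * k - 2 * log_likelihood
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic  = np.log(n) * k - 2 * log_likelihood
    return aic, aicc, bic

# Example: an ordinary least-squares fit with Gaussian errors, where the maximised
# log-likelihood can be written in terms of the residual sum of squares (RSS).
n, k, rss = 100, 4, 57.3                 # illustrative numbers, not from a real fit
log_lik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)

print("AIC, AICc, BIC =", tuple(round(v, 2) for v in aic_bic(log_lik, k, n)))
```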