Applying an ML Algorithm is Not Enough!
In statistical and machine learning modeling, metrics are essential for evaluating how well our estimates perform. Let’s explore the main features of the Mean Squared Error (MSE), defined as
\[ MSE=\frac{1}{n}\sum_{i=1}^n (y_i-\hat{f}(x_i))^2, \]
where \(y_i\) is the observed value of the dependent variable, \(\hat{f}(x_i)\) is its prediction based on a set of covariates \(x_i\), and \(n\) is the sample size.
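As a minimal sketch of the formula above, the following computes the MSE with NumPy. The function name `mean_squared_error` and the numbers are illustrative assumptions, not taken from any real dataset.

```python
import numpy as np

def mean_squared_error(y, y_hat):
    """Average of the squared differences between observations and predictions."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    if y.shape != y_hat.shape:
        raise ValueError("y and y_hat must have the same shape")
    return float(np.mean((y - y_hat) ** 2))

# Illustrative values only.
y_obs = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

# Errors are 1, 0, -1, -2, so the squared errors are 1, 0, 1, 4.
mse = mean_squared_error(y_obs, y_pred)
print(mse)  # 1.5
```

Note that the result is expressed in the squared units of the dependent variable.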
When should it be used?
How do we interpret it?
Why is it widely used?
\[ MSE=\underbrace{\text{Variance}}_{A}+ \underbrace{\text{Bias}^2}_{B} \]
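The variance refers to the amount by which \(\hat{f}(x_i)\) would change if we estimated it using a different sample. Ideally, the estimate of \(f\) should not vary much from one sample to another.
Bias refers to the error introduced by approximating a real-life problem, which may be very complicated, with a simpler model.
In other words, the decomposition of the MSE helps the analyst pinpoint whether poor performance comes from a model that is too sensitive to the particular sample (high variance) or too simple for the problem (high bias). Strictly speaking, when the error is measured against new noisy observations \(y_i\), the expected MSE also contains an irreducible noise term that no model can remove.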
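The decomposition can be checked numerically with a small simulation. This is only a sketch under stated assumptions: the true function `true_f`, the noise level, the sample size, and the polynomial degrees are all hypothetical choices made for illustration. The error is measured against the true function at a single point `x0` rather than against noisy observations, so only variance and squared bias appear.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    # Hypothetical "real-life" relationship, chosen only for illustration.
    return np.sin(2 * np.pi * x)

def fit_and_predict(degree, x0, n=30, noise_sd=0.3):
    """Draw a fresh sample, fit a polynomial of the given degree,
    and return its prediction at x0."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = true_f(x) + rng.normal(0.0, noise_sd, size=n)
    coefs = np.polyfit(x, y, deg=degree)
    return np.polyval(coefs, x0)

x0 = 0.25
results = {}
for degree in (1, 5):
    # Predictions at x0 from 2000 independent samples.
    preds = np.array([fit_and_predict(degree, x0) for _ in range(2000)])
    variance = preds.var()                         # spread across samples
    bias_sq = (preds.mean() - true_f(x0)) ** 2     # squared systematic error
    mse_vs_f = np.mean((preds - true_f(x0)) ** 2)  # MSE against the true f(x0)
    results[degree] = (variance, bias_sq, mse_vs_f)
    print(f"degree={degree}  variance={variance:.4f}  "
          f"bias^2={bias_sq:.4f}  mse={mse_vs_f:.4f}")
```

For each degree, the printed variance and squared bias add up to the MSE. In this setup the straight line (degree 1) cannot follow the curve, so its squared bias is much larger than that of the degree-5 polynomial.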
Drawbacks
MSE comes with a significant drawback: because the errors are squared, a few extreme values can dominate the average. When the data contain outliers, MSE should be used with caution and might not be the most appropriate metric.
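To see how strongly a single extreme value can affect the MSE, the sketch below corrupts one observation and recomputes the metric. The data are made up, and the mean absolute error (MAE) is included only as a point of comparison; it is an assumption of this example, not a metric discussed above.

```python
import numpy as np

def mse(y, y_hat):
    return float(np.mean((y - y_hat) ** 2))

def mae(y, y_hat):
    return float(np.mean(np.abs(y - y_hat)))

# Made-up data in which every prediction is off by exactly 1.
y_obs = np.array([10.0, 12.0, 11.0, 13.0, 12.0])
y_pred = np.array([11.0, 11.0, 12.0, 12.0, 13.0])

clean = (mse(y_obs, y_pred), mae(y_obs, y_pred))
print(clean)  # (1.0, 1.0)

# Replace the last observation with a hypothetical recording error.
y_out = y_obs.copy()
y_out[-1] = 62.0

# The last error becomes 49, and its square (2401) dominates the sum.
with_outlier = (mse(y_out, y_pred), mae(y_out, y_pred))
print(with_outlier)  # MSE rises to 481.0, MAE to about 10.6
```

One corrupted point out of five multiplies the MSE by 481, while the MAE grows by a factor of about 10.6.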