When a sample does not accurately represent the true population, estimates computed from it are systematically off: \[ \mathbb{E}[\hat{\theta}] \neq \theta \]
2025-09-17
-Bias can lead to inaccurate results, which is especially misleading when the data is used to train a model
-Models trained on biased data can inherit that bias and produce unfair outcomes
-Data bias can cause people to question the reliability of these models or AI systems
-In a biased sample, the group proportions do not match those of the true population
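A minimal base R sketch of how a biased selection process distorts group proportions. The population, the 4-to-1 selection weight, and all object names here are hypothetical and chosen only for illustration.

```r
set.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 people, half "F" and half "M"
population <- data.frame(
  gender = rep(c("F", "M"), each = 5000)
)

# Biased selection (assumption): "M" rows are 4 times as likely to be drawn as "F" rows
selection_weight <- ifelse(population$gender == "M", 4, 1)
picked <- sample(nrow(population), size = 500, prob = selection_weight)
biased_sample <- population[picked, , drop = FALSE]

# Compare group proportions in the population and in the biased sample
pop_share    <- prop.table(table(population$gender))
sample_share <- prop.table(table(biased_sample$gender))

print(round(pop_share, 2))
print(round(sample_share, 2))
```

The population is split evenly, while the sample ends up dominated by the over-selected group, so any statistic that differs by gender would be skewed as well.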
Weighted mean:
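\[ \hat{\mu}_w \;=\; \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}, \qquad w_i \;=\; \frac{1}{\pi_i} \]
-Here \(\pi_i\) is the probability that unit \(i\) is included in the sample, so under-sampled units receive larger weights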
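The weighted mean can be sketched in base R as follows, assuming the inclusion probabilities are known for every sampled unit. The two groups, their means, and the probabilities 0.02 and 0.10 are made-up values for illustration, not figures from these notes.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: two equally sized groups with different means
N <- 10000
group <- rep(c("A", "B"), each = N / 2)
x <- rnorm(N, mean = ifelse(group == "A", 10, 20))

# Assumed inclusion probabilities pi_i: group B is five times as likely to be sampled
pi_i <- ifelse(group == "A", 0.02, 0.10)

# Draw the sample: each unit is included independently with probability pi_i
included <- runif(N) < pi_i
x_s <- x[included]
w_s <- 1 / pi_i[included]   # w_i = 1 / pi_i

# Unweighted mean (pulled toward group B) and inverse-probability weighted mean
mu_naive <- mean(x_s)
mu_w <- sum(w_s * x_s) / sum(w_s)

print(c(true = mean(x), naive = mu_naive, weighted = mu_w))
```

The last expression is the formula written out term by term; base R's `weighted.mean(x_s, w_s)` computes the same quantity.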
-This shows the true population next to biased data drawn from only part of the population
library(dplyr)
# Stratified sample: draw 50 rows from each gender group
population %>% group_by(gender) %>% sample_n(50)