In 1973, UC Berkeley admitted 45% of its male applicants but only 35% of its female applicants, which raised suspicions of gender bias. The graduate school faced backlash for admitting more men than women and was reportedly being sued over it. The school had a statistician look over the data, who found that women had tended to apply to departments that admitted a smaller percentage of applicants, and that is why the overall admission rates for men and women were so different.

The figure above shows that, overall, male applicants were admitted at a higher rate than female applicants.

The figure above shows that few female applicants applied to departments A and B, while many more applied to departments D through F, and more women were admitted in those departments. This uneven spread of applications across departments is what led people to believe that the school favored men over women.

Simpson’s paradox refers to a statistical phenomenon where an association between two variables in a population emerges, disappears or reverses when the population is divided into subpopulations. Put the other way around, it is a trend or result that is present within groups of data but reverses or disappears when the groups are combined. Simpson’s paradox showcases the importance of skepticism and of interpreting data with respect to the real world, and also the dangers of oversimplifying a more complex truth by trying to see the whole story from a single view of the data.
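To make the reversal concrete, here is a minimal Python sketch. The two departments ("easy" and "hard"), the names applications, department_rates and overall_rates, and all of the counts are invented for illustration; they are not the Berkeley figures.

```python
# Hypothetical numbers, invented for illustration only (NOT the Berkeley data).
# applications[department][group] = (applicants, admitted)
applications = {
    "easy": {"men": (80, 48), "women": (20, 13)},
    "hard": {"men": (20, 4), "women": (80, 20)},
}


def department_rates(data):
    """Admission rate of each group inside each department."""
    return {
        dept: {group: admitted / applicants
               for group, (applicants, admitted) in groups.items()}
        for dept, groups in data.items()
    }


def overall_rates(data):
    """Admission rate of each group after pooling all departments."""
    totals = {}
    for groups in data.values():
        for group, (applicants, admitted) in groups.items():
            seen_applicants, seen_admitted = totals.get(group, (0, 0))
            totals[group] = (seen_applicants + applicants,
                             seen_admitted + admitted)
    return {group: admitted / applicants
            for group, (applicants, admitted) in totals.items()}


by_dept = department_rates(applications)
overall = overall_rates(applications)

for dept, rates in by_dept.items():
    print(f"{dept}: men {rates['men']:.0%}, women {rates['women']:.0%}")
print(f"overall: men {overall['men']:.0%}, women {overall['women']:.0%}")
```

With these made-up counts, women have the higher admission rate in both departments (65% vs 60% and 25% vs 20%), yet the lower rate overall (33% vs 52%), because most of the women applied to the department that admits fewer people.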

A lurking variable is a variable that is neither the explanatory variable nor the response variable but has a relationship (e.g. may be correlated) with both the response and the explanatory variable. It is not considered in the study but could influence the relationship between the variables in the study. In the Berkeley example, the department an applicant applied to is the lurking variable.
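One rough way to check whether department behaves like a lurking variable is to see whether it is related to both gender (the explanatory variable) and admission (the response). The sketch below does this with invented counts; the names counts, share_applying and department_admit_rate are hypothetical, and none of the numbers come from the Berkeley data.

```python
# Hypothetical counts, invented for illustration only (NOT the Berkeley data).
# counts[department][group] = (applicants, admitted)
counts = {
    "easy": {"men": (80, 48), "women": (20, 13)},
    "hard": {"men": (20, 4), "women": (80, 20)},
}


def share_applying(data, group):
    """Fraction of a group's applicants that applied to each department."""
    total = sum(groups[group][0] for groups in data.values())
    return {dept: groups[group][0] / total for dept, groups in data.items()}


def department_admit_rate(data):
    """Admission rate of each department, pooling all groups."""
    rates = {}
    for dept, groups in data.items():
        applicants = sum(a for a, _ in groups.values())
        admitted = sum(m for _, m in groups.values())
        rates[dept] = admitted / applicants
    return rates


men_share = share_applying(counts, "men")
women_share = share_applying(counts, "women")
dept_rate = department_admit_rate(counts)

for dept in counts:
    print(f"{dept}: {men_share[dept]:.0%} of men and "
          f"{women_share[dept]:.0%} of women applied here; "
          f"department admits {dept_rate[dept]:.0%}")
```

In this toy data the choice of department differs sharply by gender (80% of women but only 20% of men apply to the "hard" department), and the departments differ sharply in how many applicants they admit (24% vs 61%). A variable tied to both sides like this is the kind that can distort a comparison when it is left out.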

References: Tom Griggs, “Simpson’s Paradox and Interpreting Data”, https://towardsdatascience.com/?source=post_page-----6a0443516765-------------------------------- and https://online.stat.psu.edu/stat500/lesson/1/1.1/1.1.4