UC Berkeley and Simpson’s Paradox

Described by Edward Hugh Simpson in 1951, Simpson’s paradox is a statistical phenomenon in which a trend that appears in pooled data reverses once the data is split by an additional variable. A popular example of this paradox is the case of UC Berkeley’s 1973 graduate school acceptance rates for men and women. Looking only at the overall numbers, it initially appeared that there was a gender bias in the selection process.

As shown in the graph above, women were rejected at a noticeably higher rate than men. The school had a statistician, Peter Bickel, examine the data more closely to determine whether there actually was a gender bias in favor of men. To do so, Bickel broke down the acceptance rates by department.

Adding a Department Facet

This graph tells a different story than the first. Broken down this way, the data suggest a gender bias in favor of women: men were more likely than women to be rejected in four of the six departments shown. What Bickel and his team found was that women tended to apply to the more selective departments, which lowered the percentage of women accepted overall.
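The reversal can be checked with a few lines of arithmetic. The sketch below is in Python and uses the six-department counts that are widely reproduced for this case (they ship with R as the UCBAdmissions dataset). That these are the exact figures behind the graphs in this post is an assumption, and the names counts, acceptance_rate, and overall_rate are made up for the illustration.

```python
# Applicant counts as (admitted, rejected) for six departments, taken from
# R's UCBAdmissions dataset. Assumed, not confirmed, to be the data behind
# the graphs in this article.
counts = {
    "A": {"male": (512, 313), "female": (89, 19)},
    "B": {"male": (353, 207), "female": (17, 8)},
    "C": {"male": (120, 205), "female": (202, 391)},
    "D": {"male": (138, 279), "female": (131, 244)},
    "E": {"male": (53, 138), "female": (94, 299)},
    "F": {"male": (22, 351), "female": (24, 317)},
}


def acceptance_rate(admitted, rejected):
    """Fraction of applicants who were admitted."""
    return admitted / (admitted + rejected)


def overall_rate(gender):
    """Acceptance rate for one gender with all departments pooled."""
    admitted = sum(dept[gender][0] for dept in counts.values())
    rejected = sum(dept[gender][1] for dept in counts.values())
    return acceptance_rate(admitted, rejected)


# Pooled view: one rate per gender across all departments.
for gender in ("male", "female"):
    print(f"overall {gender}: {overall_rate(gender):.1%}")

# Faceted view: compare the two rates inside each department.
departments_favoring_women = []
for name, dept in counts.items():
    male = acceptance_rate(*dept["male"])
    female = acceptance_rate(*dept["female"])
    print(f"dept {name}: male {male:.1%}, female {female:.1%}")
    if female > male:
        departments_favoring_women.append(name)

print("women had the higher rate in:", departments_favoring_women)
```

With these counts, the pooled rates come out to roughly 44.5% for men and 30.4% for women, yet women have the higher acceptance rate in departments A, B, D, and F.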

Simpson’s Paradox

This example of UC Berkeley’s graduate acceptance rates is a great illustration of Simpson’s paradox. At first glance, it appeared that the admissions process was biased in favor of men. After Peter Bickel broke the data down by department, it turned out that, if anything, women were the ones favored. The department applied to is what is referred to as a lurking variable. More broadly, a lurking variable is a hidden variable that “splits data into multiple separate distributions” (Grigg). Though this example is rather simple, hidden variables are not always easy to find.
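To see how the department acts as a lurking variable, it helps to look at where each group applied. The following Python sketch again relies on the six-department counts from R's UCBAdmissions dataset, which are assumed rather than confirmed to match the data discussed here. The variable names are hypothetical.

```python
# Per department: (male applicants, female applicants, total admitted).
# Figures from R's UCBAdmissions dataset; assumed to match this article.
applications = {
    "A": (825, 108, 601),
    "B": (560, 25, 370),
    "C": (325, 593, 322),
    "D": (417, 375, 269),
    "E": (191, 393, 147),
    "F": (373, 341, 46),
}

male_total = sum(m for m, _, _ in applications.values())
female_total = sum(f for _, f, _ in applications.values())


def department_rate(item):
    """Acceptance rate of a department, all applicants pooled."""
    _, (male, female, admitted) = item
    return admitted / (male + female)


# Order departments from highest to lowest acceptance rate.
by_rate = sorted(applications.items(), key=department_rate, reverse=True)

for name, (male, female, admitted) in by_rate:
    print(
        f"dept {name}: accepts {admitted / (male + female):.1%}, "
        f"receives {male / male_total:.1%} of men's applications, "
        f"{female / female_total:.1%} of women's"
    )

# Share of each group's applications sent to the two least selective departments.
easy = [name for name, _ in by_rate[:2]]
male_share_easy = sum(applications[d][0] for d in easy) / male_total
female_share_easy = sum(applications[d][1] for d in easy) / female_total
print(f"least selective departments: {easy}")
print(f"men applying there: {male_share_easy:.1%}")
print(f"women applying there: {female_share_easy:.1%}")
```

With these counts, about 51% of men's applications but only about 7% of women's went to the two departments with the highest acceptance rates (A and B). That imbalance is enough to pull the pooled rate for women below the pooled rate for men.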

Works Cited

Grigg, Tom. “Simpson’s Paradox and Interpreting Data.” Simpson’s Paradox and Interpreting Data, Towards Data Science, 8 Jan. 2019, https://towardsdatascience.com/simpsons-paradox-and-interpreting-data-6a0443516765.

https://r-charts.com/colors/