UC Berkeley’s Incident

In 1973, the University of California, Berkeley faced allegations of gender discrimination based on its admissions data. The graph below, similar to the one published at the time, depicts a drastic difference between male and female acceptance rates.

As the graph depicts, the rejection rate for female applicants was higher than the rejection rate for male applicants. Given that gap, it’s no wonder UC Berkeley was alleged to be discriminating against women. This one graph almost led to a lawsuit against the university.

UC Berkeley’s Response

Fearing a lawsuit, the university asked a statistician to examine its data, and he came to a different conclusion. When he broke the applications down by the department each applicant had applied to, he noticed that female applicants were favored in 4 of the 6 departments, with a roughly even playing field in the other 2. He and his team realized that women had applied disproportionately to departments that accepted a smaller percentage of applicants overall.

Once the department each applicant applied to is taken into account, the initial trend of discrimination against female applicants is no longer present. In fact, the opposite turns out to be true: within most departments, female applicants were accepted at a higher rate than male applicants. Because female applicants applied mostly to departments that accepted a smaller percentage of all applicants, their rejections in those departments skewed the aggregate data against them.
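To make the mechanism concrete, here is a minimal Python sketch. The two departments and all of the applicant counts are hypothetical, chosen only to reproduce the effect; they are not the 1973 Berkeley figures. The function names are likewise just illustrative.

```python
# Hypothetical numbers for illustration only (NOT the real Berkeley data).
# Structure: department -> group -> (applicants, admitted)
applications = {
    "Dept X (less selective)": {"men": (800, 480), "women": (200, 130)},
    "Dept Y (more selective)": {"men": (200, 40), "women": (800, 200)},
}


def department_rates(data):
    """Acceptance rate for each group within each department."""
    return {
        dept: {group: admitted / applicants
               for group, (applicants, admitted) in groups.items()}
        for dept, groups in data.items()
    }


def overall_rates(data):
    """Acceptance rate for each group with all departments pooled together."""
    totals = {}
    for groups in data.values():
        for group, (applicants, admitted) in groups.items():
            prev_applicants, prev_admitted = totals.get(group, (0, 0))
            totals[group] = (prev_applicants + applicants,
                             prev_admitted + admitted)
    return {group: admitted / applicants
            for group, (applicants, admitted) in totals.items()}


by_dept = department_rates(applications)
overall = overall_rates(applications)

for dept, rates in by_dept.items():
    print(dept, {group: round(r, 2) for group, r in rates.items()})
print("Overall", {group: round(r, 2) for group, r in overall.items()})
```

In this made-up data, women are accepted at a higher rate than men in both departments, yet the pooled rate is higher for men, because most of the women applied to the more selective department.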

Relation to Simpson’s Paradox

The incident at UC Berkeley is a great example of a phenomenon known as Simpson’s Paradox. Simpson’s Paradox is the reversal, or disappearance, of a trend when additional variables are introduced. In this case, an initial look at the university’s acceptance rates for males and females seems to indicate discrimination against applicants based on gender.

However, when the statistician took a closer look at the 6 departments that received the most applications, a new trend was observed. This reversal, or disappearance, of the trend of discrimination can be attributed to the concept of a lurking variable. A lurking variable is a variable that is left out of the initial analysis but, when taken into account, changes how the data should be interpreted. In this case, the department each applicant applied to is the lurking variable.
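One way to see why the department matters is to write a group's overall acceptance rate as a weighted average of its per-department rates. The notation below is introduced here for illustration and does not come from the source: n_{g,d} is the number of applicants from group g to department d, and r_{g,d} is that group's acceptance rate in that department.

```latex
R_g = \sum_{d} w_{g,d}\, r_{g,d},
\qquad
w_{g,d} = \frac{n_{g,d}}{\sum_{d'} n_{g,d'}}
```

Because the weights w_{g,d} differ between groups, a group can have the higher rate r_{g,d} in every department and still end up with the lower overall rate R_g if most of its weight sits on departments where acceptance rates are low for everyone.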

Source

Simpson, David. “Simpson’s Paradox and Interpreting Data.” Towards Data Science, 13 Apr. 2020, https://towardsdatascience.com/simpsons-paradox-and-interpreting-data-6a0443516765.