Recommender systems use aggregate data from a population of users to generate recommendations, such as movie ratings, housing prices, or search results. Because discrimination is present in society, the data feeding a recommender system may reflect, or even amplify, those prejudices.

In his 2016 presentation, “When Recommendation Systems Go Bad,” Evan Estola provides several examples of this happening, including one where Google searches involving black-sounding names were more likely to serve up ads suggestive of a criminal record than searches involving white-sounding names. As a result, the ads served by this recommender reinforced existing racial bias against the black community.

Addressing these existing biases is not as simple as, say, removing race and gender from the training data. As M. Hardt et al. note in their paper Equality of Opportunity in Supervised Learning (2016), this “‘fairness through unawareness’ is ineffective due to the existence of redundant encodings, ways of predicting protected attributes from other features.” This makes intuitive sense if one considers proxy measures such as address, zip code, or neighborhood, features that can encode the same societal biases and feed them right back into the recommendations.
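One quick way to check for a redundant encoding is to try to predict the protected attribute from the remaining features: if a simple model recovers it well above the base rate, dropping the column did not remove the information. The sketch below illustrates the idea on synthetic data; the column names (`zip_code`, `income`) and the group label are hypothetical placeholders, not anything from Estola's talk or Hardt et al.'s paper.

```python
# Sketch: checking for a "redundant encoding" of a protected attribute.
# All data here is synthetic and for illustration only; in practice you
# would use your own feature table and protected-attribute column.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 5000

# Hypothetical binary protected-group label.
group = rng.integers(0, 2, size=n)

# "zip_code" acts as a proxy: each group is concentrated in a few zip codes,
# with some mixing so the proxy is imperfect.
zip_code = np.where(group == 1,
                    rng.choice([94601, 94603, 94621], size=n),
                    rng.choice([94301, 94022, 94028], size=n))
mix = rng.random(n) < 0.15
zip_code[mix] = rng.choice([94601, 94603, 94621, 94301, 94022, 94028],
                           size=mix.sum())

income = rng.normal(50 + 15 * (group == 0), 10, size=n)

# Note: the protected attribute itself is NOT in the feature matrix.
X = pd.get_dummies(pd.DataFrame({"zip_code": zip_code.astype(str),
                                 "income": income}),
                   columns=["zip_code"])

# A simple model still recovers the protected attribute from the proxies,
# which is exactly the redundant encoding that defeats
# "fairness through unawareness".
clf = LogisticRegression(max_iter=1000)
acc = cross_val_score(clf, X, group, cv=5, scoring="accuracy").mean()
print(f"Protected attribute predicted from proxies: ~{acc:.2f} accuracy")
print(f"Base rate (majority class): {max(group.mean(), 1 - group.mean()):.2f}")
```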

Fortunately, M. Hardt et al. present a basic framework that data scientists and machine learning practitioners can use to help address these biases. One of their many powerful suggestions is “measuring unfairness, rather than proving fairness.” This framing makes improving the experience of all users presented with recommendations the incentive, something that makes sense from a business perspective as well as a socio-ethical one.
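As a simplified illustration of what “measuring unfairness” can look like in code, the sketch below computes the gap in true positive rates between two groups, which is the quantity Hardt et al.'s equality-of-opportunity criterion asks to be small. The arrays are made-up stand-ins for real model outputs, and this is one possible metric rather than the paper's full framework.

```python
# Sketch: measuring unfairness as an equality-of-opportunity gap.
# Hypothetical inputs:
#   y_true - ground-truth labels (1 = qualifies for the positive outcome)
#   y_pred - the recommender/classifier's binary decisions
#   group  - protected-attribute membership (0 / 1)
import numpy as np

def true_positive_rate(y_true, y_pred):
    """P(prediction = 1 | label = 1)."""
    positives = y_true == 1
    if positives.sum() == 0:
        return float("nan")
    return (y_pred[positives] == 1).mean()

def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true positive rates between the two groups.

    A gap near 0 means qualified members of both groups receive the
    positive recommendation at similar rates; a large gap quantifies
    the unfairness to be reduced.
    """
    tpr_0 = true_positive_rate(y_true[group == 0], y_pred[group == 0])
    tpr_1 = true_positive_rate(y_true[group == 1], y_pred[group == 1])
    return abs(tpr_0 - tpr_1), tpr_0, tpr_1

# Toy example with made-up numbers.
y_true = np.array([1, 1, 0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([1, 0, 0, 1, 1, 0, 0, 0, 1, 0])
group  = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

gap, tpr_0, tpr_1 = equal_opportunity_gap(y_true, y_pred, group)
print(f"TPR group 0: {tpr_0:.2f}, TPR group 1: {tpr_1:.2f}, gap: {gap:.2f}")
```

Tracking a metric like this over time, rather than trying to certify a system as “fair” once and for all, is the spirit of measuring unfairness.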

Certainly, recommender systems can reinforce existing biases, but I believe they can also play an important role in helping to alleviate them. Ethics in data science is an important topic that requires careful examination and action to drive progress and inclusion rather than simply reinforcing the status quo.