————

Equality of Opportunity in Supervised Learning

Moritz Hardt, Eric Price, Nathan Srebro

https://arxiv.org/pdf/1610.02413.pdf

————

    Machine-learned predictors are increasingly used to make important decisions in people's lives. Because of their growing importance, it is critical that they allow people and organizations to make fair and equitable decisions. This article presents a method for minimizing discrimination based on membership in a protected group. The constraints the article tries to satisfy are: the method should find an optimal set of decision criteria; it should work with the information actually available; and it should be able to leverage standard machine learning techniques.

    Hardt, Price, and Srebro spend most of the article showing that a non-discriminatory predictor can be derived from any score trained with standard machine learning methods, by post-processing the groups' ROC curves. The key property is that the derivation depends only on the joint distribution of the outcome being predicted, group membership, and the chosen estimator; no access to the training pipeline is needed. Such a derived predictor would appropriately reflect every individual's likelihood of having the quality sought.
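    As a minimal sketch of this post-processing idea (not the authors' code; all variable names here are hypothetical), the following Python picks a separate score cutoff for each group so that true positive rates match, which is the equal opportunity criterion discussed below. Note that it consumes only samples from the joint distribution of score, outcome, and group:

    import numpy as np

    def tpr_at(scores, labels, threshold):
        # True positive rate: share of actual positives approved at this cutoff.
        return np.mean(scores[labels == 1] >= threshold)

    def equal_opportunity_thresholds(scores, labels, groups, target_tpr):
        # For each group, pick the highest cutoff whose TPR reaches target_tpr,
        # so qualified applicants are approved at the same rate in every group.
        thresholds = {}
        for g in np.unique(groups):
            m = groups == g
            for t in np.sort(np.unique(scores[m]))[::-1]:
                if tpr_at(scores[m], labels[m], t) >= target_tpr:
                    thresholds[g] = t
                    break
        return thresholds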

     Domain knowledge is important for specifying a model. If more information can be gathered, and in particular the right information gathered reliably, a model will perform better.

    The article uses a case study of FICO credit scores, comparing default rates and ROC curves across racial groups. It examines the costs associated with decision rules that maximize profit, enforce equal opportunity, or enforce equalized odds. Under each rule, different groups gain or lose. The authors' view is that equal opportunity is the fairest criterion and a good middle ground.
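    A hedged illustration of that comparison, assuming arrays of scores, repayment labels, and group labels (not the authors' actual FICO data), might compute per-group ROC curves with scikit-learn:

    from sklearn.metrics import roc_auc_score, roc_curve

    def roc_by_group(scores, labels, groups):
        # One ROC curve (and AUC) per group; gaps between the curves are
        # what make a single profit-maximizing cutoff treat groups unequally.
        curves = {}
        for g in set(groups):
            m = groups == g
            fpr, tpr, _ = roc_curve(labels[m], scores[m])
            curves[g] = (fpr, tpr, roc_auc_score(labels[m], scores[m]))
        return curves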

     Modeling for credit default produces a better prediction for the largest group, white borrowers, in the FICO example. The authors propose applying an adjustment after the initial model produces its results, shifting incentives in an appropriate direction. Without such an adjustment, the burden of incorrectly denying credit, a false negative, has been felt mostly by consumers: African American and Hispanic borrowers have paid a price by not receiving loans that they should have. This article suggests that the burden should be shifted to the modeler. By adding the adjustment, the modeler acknowledges that the model is not equally accurate for every group. To assume less risk, a company should collect enough of the right data to determine default risk well for minority groups too.
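    The adjustment itself can be pictured as a per-group shift applied after the base model scores an applicant; a sketch under the same hypothetical names as above:

    def adjusted_decision(score, group, base_threshold, adjustment):
        # adjustment[group] is the post-hoc shift described above; it moves a
        # group's cutoff to compensate for a score that predicts that group
        # less accurately. It is zero where no compensation is needed.
        return score >= base_threshold + adjustment.get(group, 0.0)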

     The goal of an adjusted model is for the adjustment to dwindle to zero. This would happen once the model is equally good for every group: each decision would then be responding appropriately to the joint distribution, not to the distribution conditional on group membership.
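    One hypothetical way to check whether the adjustment has dwindled to zero, reusing numpy from the earlier sketch, is to measure how far apart the groups' error rates sit at a single shared cutoff:

    def equalized_odds_gap(scores, labels, groups, threshold):
        # Spread of per-group TPR and FPR at one shared cutoff; both gaps
        # near zero mean the single cutoff serves every group equally well.
        tprs, fprs = [], []
        for g in np.unique(groups):
            m = groups == g
            approved = scores[m] >= threshold
            tprs.append(np.mean(approved[labels[m] == 1]))
            fprs.append(np.mean(approved[labels[m] == 0]))
        return max(tprs) - min(tprs), max(fprs) - min(fprs)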

    This approach is a good idea. It delivers a proper incentive to predict risk as accurately as possible, and it attempts to decouple risk from ethnic identity. Ultimately, providing the best estimate possible delivers returns and efficiency to both consumers and companies. Machine learning models can exacerbate inequality; deployed properly, they present an opportunity to improve performance while minimizing discrimination.