Data 612 Discussion Assignment 3

Assignment Instructions

As more systems and sectors are driven by predictive analytics, there is increasing awareness of the possibility and pitfalls of algorithmic discrimination. In what ways do you think Recommender Systems reinforce human bias? Reflecting on the techniques we have covered, do you think recommender systems reinforce or help to prevent unethical targeting or customer segmentation? Please provide one or more examples to support your arguments.

Discussion

Recommender systems rely on user data to make recommendations. That data inherently carries human biases, so when it is fed to a recommender system those biases are amplified and propagated back to users in the form of system-generated recommendations. At some stage the system re-ingests user behaviour that was itself shaped by those recommendations, amplifies the biases further, serves new recommendations, and so the downward spiral continues.
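To make that loop concrete, below is a minimal simulation in Python (entirely hypothetical numbers, not drawn from any real system). Two items are equally appealing to users, but the click logs start slightly skewed toward item 0. Each round the system ranks by past clicks, the leading item gets disproportionate exposure, users click what they are shown, and the new clicks are re-ingested as training data.

    import numpy as np

    clicks = np.array([55.0, 45.0])      # slightly skewed historical click logs
    true_appeal = np.array([0.5, 0.5])   # users actually like both items equally

    for round_num in range(6):
        # Ranking by popularity gives the leader superlinear exposure;
        # squaring is a stand-in for "most users only see the top slot".
        exposure = clicks**2 / np.sum(clicks**2)
        # Users can only click what they are shown, weighted by true appeal.
        clicks = 1000 * exposure * true_appeal
        print(f"round {round_num}: item-0 share of clicks = {clicks[0] / clicks.sum():.3f}")

The initial 55/45 split snowballs to roughly 100/0 within a handful of rounds, even though users like both items equally: the system ends up amplifying its own logging bias rather than any real preference.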

Omitting important variables from a dataset can also cause significant harm. Although not directly a recommender system, an AI healthcare project provides a striking instance of this (http://people.dbmi.columbia.edu/noemie/papers/15kdd.pdf).

In the 1990s, an AI project aimed to predict the probability of death for patients with pneumonia. The goal was to ensure that high-risk patients were admitted to hospital, whilst low-risk patients would be treated as outpatients. The problem is that a machine only sees what is in the data; to it the world is true or false, with no grey. The training data showed that asthma patients had a lower risk of death from pneumonia, so the algorithm learned to flag them as low risk and register them as outpatients. What the data failed to capture was why: asthma patients displaying pneumonia symptoms were immediately admitted to the emergency room, and that aggressive care greatly improved their chances of survival. This context falls under the "grey" that only a doctor would know, and because it was omitted from the training data, the model would have sent some of the highest-risk patients home.
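The same mechanism is easy to reproduce with synthetic data. The sketch below uses my own illustrative numbers and feature names, not the study's data: the treatment policy is hidden from the model, and a standard logistic regression duly learns a negative, "protective" coefficient for asthma.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 20_000
    asthma = rng.random(n) < 0.15    # 15% of patients have asthma
    aggressive_care = asthma         # hospital policy, hidden from the model

    # True mortality: asthma raises risk, but aggressive care lowers it more.
    risk = 0.10 + 0.10 * asthma - 0.12 * aggressive_care
    died = rng.random(n) < risk

    model = LogisticRegression().fit(asthma.reshape(-1, 1), died)
    print("asthma coefficient:", model.coef_[0][0])   # negative: "lower risk"

Deployed naively, a model like this would triage asthma patients as low-risk outpatients, which is precisely the failure the researchers caught.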

Stephen Haslett

6/27/2020