Case Study #3
Reading: ISLBS (pg. 14 - 36)
ISLBS: Chapter 1 Exercises (pg. 77).
Collaborators (for this assignment): [Insert name(s) of other students]
In a widely cited 2016 study, computer scientists from Princeton University and the University of Bath demonstrated that significant, harmful racial and gender biases are consistently reflected in the output of learning algorithms commonly used in natural language processing to represent the relationships between word meanings.
For example, one of the tools they studied, GloVe (Global Vectors for Word Representation), is a learning algorithm for creating word embeddings: vector representations that capture similarities and associations among word meanings in terms of the distances between vectors. Thus the vectors for the words ‘water’ and ‘rain’ would appear much closer together than the vectors for the terms ‘water’ and ‘red.’ As with other similar data models for natural language processing, when GloVe is trained on a body of text from the Web, it learns to reflect in its own outputs “accurate imprints of [human] historic biases” (Caliskan-Islam, Bryson, and Narayanan, 2016). Some of these biases are based in objective reality (like our ‘water’ and ‘rain’ example above). Others reflect subjective values that are (for the most part) morally neutral. For example, names for flowers (rose, lilac, tulip) are much more strongly associated with pleasant words (such as freedom, honest, miracle, and lucky), whereas names for insects (ant, beetle, hornet) are much more strongly associated (have nearer vectors) with unpleasant words (such as filth, poison, and rotten).
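To make the idea of “distance between vectors” concrete, here is a minimal sketch using cosine similarity on made-up three-dimensional vectors. The numbers are purely illustrative assumptions; real GloVe embeddings are learned from Web text and typically have 50 to 300 dimensions.

# A minimal sketch (not from the study): cosine similarity between toy word vectors.
# The three-dimensional vectors below are invented for illustration; real GloVe
# embeddings are learned from text co-occurrence statistics.
import numpy as np

def cosine_similarity(u, v):
    # Cosine of the angle between two vectors; values near 1 mean "close together."
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

water = np.array([0.8, 0.1, 0.3])  # hypothetical embedding for 'water'
rain = np.array([0.7, 0.2, 0.4])   # hypothetical embedding for 'rain'
red = np.array([0.1, 0.9, 0.1])    # hypothetical embedding for 'red'

print(cosine_similarity(water, rain))  # high: 'water' and 'rain' sit close together
print(cosine_similarity(water, red))   # lower: 'water' and 'red' sit farther apart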
However, other biases in the data models, especially those concerning race and gender, are neither objective nor harmless. As it turns out, for example, common European American names such as Ryan, Jack, Amanda, and Sarah were far more closely associated in the model with the pleasant terms (such as joy, peace, wonderful, and friend), while common African American names such as Tyrone, Darnell, and Keisha were far more likely to be associated with the unpleasant terms (such as terrible, nasty, and failure).
Common names for men were also much more closely associated with career-related words such as ‘salary’ and ‘management’ than names for women were; women’s names were more closely associated with domestic words such as ‘home’ and ‘relatives.’ Career and educational stereotypes by gender were also strongly reflected in the model’s output. The study’s authors note that this is not a deficit of a particular tool, such as GloVe, but a pervasive problem across many data models and tools trained on a corpus of human language use. Because people are (and have long been) biased in harmful and unjust ways, data models that learn from human output will carry those harmful biases forward. Often the human biases are actually concentrated or amplified by the data model.
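The authors measure these associations with a statistic they call the Word-Embedding Association Test (WEAT). The sketch below is a simplified, hypothetical version of that idea, not the authors’ exact statistic: it assumes a dictionary named embeddings mapping words to vectors (for example, loaded from pretrained GloVe files) and scores a word by how much closer it sits to pleasant terms than to unpleasant terms.

# Simplified sketch of the kind of association test described above (in the
# spirit of WEAT, but not the authors' exact statistic). Assumes a dict
# `embeddings` mapping words to numpy vectors, e.g. loaded from GloVe files.
import numpy as np

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(word, pleasant, unpleasant, embeddings):
    # Mean similarity to pleasant terms minus mean similarity to unpleasant terms.
    # A positive score means the word sits closer to the pleasant cluster.
    w = embeddings[word]
    s_pleasant = np.mean([cosine(w, embeddings[a]) for a in pleasant])
    s_unpleasant = np.mean([cosine(w, embeddings[a]) for a in unpleasant])
    return s_pleasant - s_unpleasant

# Hypothetical usage (word lists follow the study; `embeddings` is assumed):
# pleasant = ["joy", "peace", "wonderful", "friend"]
# unpleasant = ["terrible", "nasty", "failure"]
# print(association("Ryan", pleasant, unpleasant, embeddings))
# print(association("Tyrone", pleasant, unpleasant, embeddings))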
Does it raise ethical concerns that biased tools are used to drive many tasks in big data analytics, from sentiment analysis (e.g., determining whether an interaction with a customer is pleasant), to hiring solutions (e.g., ranking resumes), to ad serving and search (e.g., showing you customized content), to social robotics (e.g., understanding and responding appropriately to humans in a social setting), and many other applications? Yes.
The types of ethical challenges most relevant to the word embedding study are identifying and addressing ethically harmful data bias, and validation and testing of data models and analytics. This is because the algorithm they used may need to be better understood, or updated, so that it associates words correctly across different contexts. There could also have been human bias in how the algorithm was created.
The ethical challenges are: 1) identifying and addressing ethically harmful data bias, and 2) validation and testing of data models and analytics.
[Data practitioners should ask how reliable their algorithm is and whether, when it is used in big data applications, it will provide accurate measures of the data.]
[One risk of this solution is that the debiasing techniques could be biased themselves, or that new harmful biases could be created. Another solution could be to rebuild the algorithm so that it does not learn or reproduce the harmful biases that raise these ethical issues.]
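One family of debiasing techniques, sometimes called “hard debiasing” in later research (it is not part of the Princeton/Bath study), estimates a bias direction in the embedding space and projects it out of other word vectors. The toy sketch below illustrates the idea under that assumption, and also why the risk noted above is real: the projection only removes bias along the single direction we happened to estimate.

# Illustrative sketch of projecting out a bias direction ("hard debiasing"
# from later research, not the study discussed here). Toy vectors stand in
# for real embeddings; the result is only as good as the estimated direction.
import numpy as np

def bias_direction(pair_a, pair_b):
    # Estimate a bias direction from a pair of vectors, e.g. 'he' - 'she'.
    d = pair_a - pair_b
    return d / np.linalg.norm(d)

def remove_bias(vector, direction):
    # Subtract the component of `vector` that lies along the bias direction.
    return vector - np.dot(vector, direction) * direction

# Hypothetical usage with toy vectors:
he, she = np.array([0.9, 0.2, 0.1]), np.array([0.1, 0.8, 0.1])
career = np.array([0.8, 0.3, 0.5])
g = bias_direction(he, she)
career_debiased = remove_bias(career, g)  # 'career' with the he-she component removed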
[Applications of data in which racial or gender biases in word embeddings might cause significant ethical harm include food products, clothing, sports, and jobs. Food, clothing, and sports applications that use word embeddings could create bias and ethical harm by associating certain foods, clothes, or sports with certain groups of people, while a hiring application that ranks resumes could produce both racial and gender bias depending on the job vacancy.]
[The bias may have gotten into the algorithms because of the recognition factors used. Symmetry and a lack of blemishes and wrinkles were taken to indicate that such faces are more beautiful. These factors imply that those with blemishes or imperfections are not as beautiful as others. The recognition tools may also need updating and proper adjustment to recognize darker skin tones.]
In this study, the explanatory variables are x and