DATA607: Data Science in Context Presentation

Charles Ugiagbe

2021-11-19

10 Data Science Ethics

Ethic 1: Identify the laws and regulations that might be applicable to our project.

It is important to consider which laws and regulations might be relevant, what these laws are designed to protect or accomplish, and what the impact may be of not taking them into account.

Ethic 2: Ensure you are achieving ethical accountability.

It should be clear who is accountable for minimizing the harm the project could cause. Accountability includes ensuring that the project team proactively identifies potential stakeholders and evaluates harms, such as disproportionate effects on particular groups that may arise from applying a model.

Ethic 3: Educate the workforce to recognise the risks of harm and/or unfairness

Ethic 4: How might individuals’ privacy and anonymity be compromised by our aggregation and linking of data?

While the need for anonymity is not new to the computing field, the thought process with respect to how to ensure anonymity must be re-examined with the emergence of advanced data science linking techniques. In short, consideration should be given to how privacy will be maintained through the transmission, storage and merging of the data.
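
One way to make the linking risk concrete is to count how many records share the same combination of indirectly identifying fields after datasets are merged. The sketch below is a minimal illustration, not a complete privacy assessment: the function name `smallest_group_size`, the field names and the records are all hypothetical and made up for this example.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same combination of values for the given quasi-identifier fields."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values())


# Hypothetical, made-up records (illustration only).
records = [
    {"zip": "10001", "birth_year": 1980, "sex": "F", "diagnosis": "A"},
    {"zip": "10001", "birth_year": 1980, "sex": "F", "diagnosis": "B"},
    {"zip": "10002", "birth_year": 1975, "sex": "M", "diagnosis": "A"},
]

# A result of 1 means at least one person is unique on these fields
# and could be re-identified by linking to another dataset.
k = smallest_group_size(records, ["zip", "birth_year", "sex"])
print(k)
```

A small value here suggests that the merged data may need coarser fields (for example, wider age bands) or suppression of rare combinations before it is stored or shared.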

Ethic 5: Identify if the data for our analysis or model is ethically available for its intended use.

Being able to access and collect data does not mean that it is ethical to use that data. Hence, care must be taken to understand who owns the data, what their rights and expectations are, and whether the data is being used in the way that the person (or entity) who contributed it intended.

Ethic 6: Avoid unnecessary complexity in methods to improve transparency

Ethic 7: Identify and minimize any bias in the data or in the model

Machine learning models can be built from data that contains a bias, and the model may then learn that bias. For example, machine learning algorithms have been shown to inherit racial and gender biases from their training data.
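
A simple first check for this kind of bias is to compare how often a model gives a favourable outcome to each group. The sketch below is an assumption-laden illustration rather than a full fairness audit: the helper names `selection_rates` and `disparate_impact_ratio`, the group labels and the outcomes are hypothetical.

```python
from collections import defaultdict


def selection_rates(outcomes):
    """Compute the share of favourable outcomes per group.

    `outcomes` is an iterable of (group, was_selected) pairs,
    where was_selected is True for a favourable decision.
    """
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, was_selected in outcomes:
        totals[group] += 1
        if was_selected:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Ratio of the lowest to the highest group selection rate.

    Assumes at least one group has a non-zero selection rate.
    """
    return min(rates.values()) / max(rates.values())


# Hypothetical model decisions for two made-up groups.
outcomes = (
    [("group_a", True)] * 6 + [("group_a", False)] * 4
    + [("group_b", True)] * 3 + [("group_b", False)] * 7
)

rates = selection_rates(outcomes)
ratio = disparate_impact_ratio(rates)
print(rates, ratio)
```

A ratio well below 1 does not prove the model is unfair, but it flags a gap between groups that should be examined in both the data and the model.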

Ethic 8: Consider the impact on the environment and its resources

Ethic 9: Validate and improve the work

Ethic 10: Identify the likely misinterpretations of the results and what can be done to prevent those misinterpretations.

Most predictive models are statistical in nature. They provide no guarantees; rather, they tell us about areas where an increased probability of an outcome might guide us to act differently. With this in mind, the data science project manager should ensure that the decisions made as a result of a data science project reflect the scale, accuracy and precision of the data used to create the model.
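
One way to reduce misinterpretation is to report a range alongside any headline figure, so that readers see how much the sample size limits precision. The sketch below uses the Wilson score interval for a proportion as one possible choice; the function name `wilson_interval` and the counts (85 correct predictions out of 100) are hypothetical and chosen only for illustration.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z
        * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2))
        / denom
    )
    return center - half_width, center + half_width


# Hypothetical evaluation: 85 correct predictions on 100 test cases.
low, high = wilson_interval(successes=85, trials=100)
print(f"accuracy 0.85, approx. 95% interval: {low:.2f} to {high:.2f}")
```

With only 100 test cases the interval runs from roughly 0.77 to 0.91, which is a more honest basis for a decision than the single figure of 85%.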

References:

https://www.actuaries.org.uk/system/files/field/document/An%20Ethical%20Charter%20for%20Date%20Science%20WEB%20FINAL.PDF

https://www.datascience-pm.com/10-data-science-ethics-questions/