-
Notifications
You must be signed in to change notification settings - Fork 0
/
Intro-DS-ethics.qmd
200 lines (129 loc) · 15.7 KB
/
Intro-DS-ethics.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
---
title: "What is Data Science Ethics?"
subtitle: "defining data science ethics, its importance, and how it combines the discplines of data science and ethics."
format:
html:
grid:
margin-width: 35rem
toc: true
toc_float: true
toc-title: "Page Contents"
bibliography: bibliographies/DSethics-summer2023.bib
link-citations: true
csl: bibliographies/apa-6th-edition.csl
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, fig.align = "center")
```
```{r}
#| out.width: '100%'
#| out.height: '20%'
#| fig.cap: "A comic from [Evil AI Cartoons](https://www.evilaicartoons.com/)."
#| fig-alt: "A comic from Evil AI Cartoons in black and white where three people are voicing their wants to a computer (e.g., we want AI fairness). The computer responds, 'And I want infinite battery! Talk to me when you've negotiated tradeoffs!'."
#| label: fig-AI-ethics
knitr::include_graphics("images/AI-ethics.jpg")
```
------------------------------------------------------------------------
# What is Data Science?
------------------------------------------------------------------------
Data science courses are often focused on transforming data into data model(s), but data science as a field encompasses all the processes needed to answer questions with data: from "Problem Definition" to "Deployment and Use".
```{r}
#| fig-cap: "Processes involved in data science. The amount of processing required to get the outputted object (e.g., raw data, data, a tuned model, etc.) increases as we move from left to right."
#| label: fig-ds_process
knitr::include_graphics("images/DS-stack.png")
```
Below, we walk through the data science processes described in @fig-ds_process with a hypothetical example to illustrate what happens at each processing step and how these steps relate to one another.
#### Problem Definition
defines the question we aim to answer with data. We answer questions like what counts as \`\`success" (i.e., when do we say a data model is successful), and how can we actually measure (or approximate) our event of interest?
#### Raw Data
the information collected from interactions with the world.
#### Data
the processed form of the raw data. This often is the tabularized version of the raw data.
#### Data Model(s)
the product created from running the input data through a learning algorithm (i.e., a mathematical formula that predicts an output for a given input). Data Models aim to generalize the relationship between variables in the data.
#### Tuned Data Model(s)
a data model in which the model's parameters (including hyperparameters) are adjusted, usually to better balance the model's generalizability and (prediction) accuracy for the population of interest. In practice, tuning is typically done by either splitting the data into training and test sets or by performing cross-validation.
#### Deployment and Use
generation of predictions (or other output) from the tuned data model. This is the stage where we ask questions like where should the system be used, who should be using it, and who/what should the data model be used on?
::: {.callout-warning icon="false" collapse="true"}
### Data Science Example
**Problem Definition:** suppose we aim to answer: what is the likelihood that a customer purchases product S? Only when our model (built on training data) achieves an overall accuracy $\ge$ 0.75 (for the test data) do we say that our model is successful.
**Raw Data:** our raw data includes information from each time the customer clicks on an advertisement for product S, including the timestamps for each advertisement interaction and the customer's demographic information.
**Data:** a data table where each row represents a unique customer and the columns are the variables that describe that customer. These columns include information from the raw data (e.g., the number of times they click on an advertisement for product S, their age, etc.) and also any engineered variables (e.g., their average time between advertisement clicks). During data generation, we also decide what to do with any missing values (i.e., do we leave them, ignore them, or impute them?).
**Data Model(s):** we choose to use logistic regression, where our response variable is the binary indicator of whether or not the customer purchased product S. Our explanatory variables are the customer's age, their number of advertisement clicks, and the average time between advertisement clicks. The general form of our data model would be:
$$
\begin{split}
\textrm{logit}(p(\textrm{purchase S})) = \beta_0 + \beta_1 \cdot \textrm{age} + \\ \beta_2 \cdot \textrm{ad-click number} + \beta_3 \cdot \textrm{average time between clicks}
\end{split}
$$
**Tuned Data Model(s):** we choose to do 5-fold cross-validation to tune our data model's parameters in order to maximize our model's overall accuracy at a particular cutoff value. With logistic regression, the model tuning comes as a choice of which variables to include (because logistic regression does not have any hyperparameters).
**Deployment and Use:** we ultimately decide that we should only use our tuned data model on US customers who are under a certain age. Also, we determine that only data scientists at the company who are selling product S should be able to access the data model and generate predictions with it.
:::
------------------------------------------------------------------------
# What is Ethics?
------------------------------------------------------------------------
Ethics is comprised of three main branches:
::: column-margin
### Example Questions
:::
#### 1. Applied Ethics
concerns the treatment of "moral problems, practices, and policies in personal life, professions, technology, and government," [@applied-ethics].
::: column-margin
**Applied Ethics:** Should we ever deploy predictive policing algorithms? If there is a shortage of ventilators, who should get one?
:::
#### 2. Ethical Theory
concerns "the articulation and justification of the fundamental moral principles that govern how we should live and what we ought to morally do," [@ethical-theory]. Types of overarching ethical theories include *consequentialism* [@StanConsequentialism], *deontological ethics* [@StanDeontological], and *virtue ethics* [@StanVirtue]. It is important to note that many ethical theories are very abstract. As such, ongoing philosophical work is committed to offering, critiquing, and translating abstract ethical theories into advice about how we should actually live and what is morally permissible in practice. It is also worth noting that some philosophers just focus on what we should do in specific cases and do not appeal to overarching ethical theories at all.
::: column-margin
**Ethical Theory:** Why should I be just? What constitutes respecting others?
:::
#### 3. Metaethics
explores "the status, foundations, and scope of moral values, properties, and words," [@metaethics].
::: column-margin
**Metaethics:** When we say an action is morally "wrong", do we mean the action has a certain feature that is bad, that I have a negative feeling towards the action, or something else? How do we come to know whether moral claims (e.g., we should respect others) are true or false?
:::
A key takeaway is that ethics is concerned with *normative* questions. For instance, ethics attempts to answer not what person A morally values, but rather what person A *should* morally value. Similarly, ethics is interested in what decision person A *should* make in a given context and not what decision person A actually makes (or is likely to make) in that context.
Another important distinction is between ethics and the law. Actions can be legal without being morally permissible. Conversely, actions can be illegal but still morally permissible. For example, it seems morally impermissible to plagiarize a paper in school even though it is not against the law to do so. As such, we cannot simply appeal to the law in order to understand what is morally (im)permissible. Rather, we need to defend and appeal to moral principles. For instance, we might defend that plagiarizing a paper is morally impermissible by arguing that it is deceptive. That is, plagiarism misrepresents another person's work as one's own without properly crediting them (e.g., citing them). However, is deception always morally impermissible? Suppose a person deceives their friend about their whereabouts to keep the friend's surprise birthday party a secret. In this case, the friend's deception does not seem morally impermissible. What, then, if not merely being a case of deception, explains the intuition that plagiarizing a school paper is morally impermissible? Ethics aims to answer such questions and, more broadly, offer a methodical way of approaching such questions and, with that, making morally good decisions.
------------------------------------------------------------------------
# What is Data Science Ethics?
------------------------------------------------------------------------
Data science ethics is usually considered a subfield of applied ethics. However, case studies in data science ethics can also be used to explore questions in ethical theory and metaethics. Per @DS-ethics-def, data science ethics studies and evaluates moral problems related to:
#### 1. Data
including generation, recording, processing, dissemination, and sharing
#### 2. Algorithms
including artificial intelligence, machine learning, large language models, and statistical learning models
#### 3. Corresponding Practices
including responsible innovation, programming, hacking, and professional codes[^1]
[^1]: In the context of data science ethics, professional codes are guidelines that outline the ethical standards and responsibilities for those engaged in data science work. Many companies and organizations have produced professional codes related to data science ethics, such as [Microsoft](https://www.microsoft.com/en-us/ai/responsible-ai?ef_id=_k_CjwKCAjw65-zBhBkEiwAjrqRMOOu4pQdMiA-H3F4IAPwAFy6P5AteC8WR1R1ry6SETKE3Zhlhgi4ABoC5HIQAvD_BwE_k_&OCID=AIDcmm1o1fzy5i_SEM__k_CjwKCAjw65-zBhBkEiwAjrqRMOOu4pQdMiA-H3F4IAPwAFy6P5AteC8WR1R1ry6SETKE3Zhlhgi4ABoC5HIQAvD_BwE_k_&gad_source=1&gclid=CjwKCAjw65-zBhBkEiwAjrqRMOOu4pQdMiA-H3F4IAPwAFy6P5AteC8WR1R1ry6SETKE3Zhlhgi4ABoC5HIQAvD_BwE), [IBM](https://www.ibm.com/watson/assets/duo/pdf/everydayethics.pdf), and the [United Nations](https://www.unesco.org/en/artificial-intelligence/recommendation-ethics)
A famous case study that highlights the moral problems related to data science is the Correctional Offender Management Profiling for Alternative Sanctions algorithm (COMPAS). COMPAS generates a risk score for each defendant based on their predicted likelihood of being convicted. This risk score is then used to inform decisions in the United States criminal justice system (e.g., to set bond amounts, determine criminal sentencing, and decide early release for parole). In 2016, ProPublica researchers found that Black defendants were twice as likely as white defendants to be falsely labeled as recidivists by COMPAS [@COMPAS]. Additionally, white defendants were more likely to be mislabeled as having a lower risk of recidivism than Black defendants [@COMPAS]. As such, @COMPAS concluded that COMPAS was unfair to Black defendants. However, other literature complicates what it means for an algorithm to be *fair*. For instance, some argue that fairness requires *predictive parity*, which in the case of COMPAS means that if Black and white defendants were each given the same risk score, they would be equally likely to recidivate [@COMPAS]
Yet, researchers have found that when base rates are different, as they are for recidivism across Black and white defendants in the United States, an algorithm cannot simultaneously satisfy *equal false positive rates*, *equal false negative rates*, and *predictive parity* across groups [@kleinberg2016], [@chouldechova2017], [@Corbett-Davies]. This has prompted discussions about what is actually required for algorithmic fairness and whether statistical criteria, like *equal false positive rates*, *equal false negative rates*, and *predictive parity*, actually track important aspects of the normative concept of fairness.[^2] Thus, we need to understand the normative concept of fairness in order to assess whether COMPAS is unfair to Black defendants and, moreover, whether it is even possible in principle for COMPAS to be fair given the unequal recidivism base rates across Black and white defendants in the United States.
[^2]: See @fairness-book for an overview of fairness measures in machine learning.
------------------------------------------------------------------------
# Importance of Data Science Ethics
------------------------------------------------------------------------
There are a number of critical decision points in data science, which can lead to moral problems in data, algorithms, and corresponding practices. @fig-DS-stack-values connects potentially morally charged decision points with data science processes.
```{r}
#| fig-cap: "Some key decision points that could produce morally charged outcomes have been added to the data science processes diagram. A data scientist may or may not be aware of these decision points or responsible for all these decisions in practice."
#| fig-alt: "data science pipeline labeled in blue with some of the key decision points which might produce morally charged outcomes. E.g., for the 'data model(s)', model assumptions are noted in blue."
#| label: fig-DS-stack-values
knitr::include_graphics("images/DS-stack-values.png")
```
Given the commonly held belief that mathematics, and consequently statistics, is objective in the sense that it is not influenced by factors such as the practitioner's moral values, potentially morally charged choices in data science are often made implicitly, without the decision-maker reflecting on, for example, how opting one choice over another (mis)aligns with their best judgment about what they ought to do or their moral duties to stakeholders. The ethical considerations at data science decision points must be made explicit: both the existence of a choice and the moral implications of the practitioner's ultimate decision are key aspects of each data science stage.
Data science ethics aims to illuminate the moral implications of choices within data science and takes an interdisciplinary perspective on aligning our data science practices with what we ought to do and our moral duties to stakeholders. For example, returning to the COMPAS example, data science ethics would address questions such as: what does it mean for an algorithm to be fair or just? Does algorithmic fairness or justice require satisfying some statistical criteria, and if so, which one(s)? Does Equivant, the company that made COMPAS, have a duty to make their algorithm transparent, explainable, or even fair? Engaging with such questions, and with data science ethics more generally, is critical to ensuring morally permissible data science practices. This engagement is particularly important given we live in the age of Big Data, where decisions with high moral stakes, like pretrial release [@COMPAS], home loan approvals [@homeloan-biased], and Child Protective Service's welfare visits [@ChildWelfare-biased], are increasingly being influenced by data science.
## More Case Studies
::: link
::: link-header
#### {{< iconify icons8:pdf >}} Markkula Center for Applied Ethics
:::
::: link-container
<a href = "https://www.scu.edu/ethics/focus-areas/internet-ethics/cases/" target = "_blank">Data science ethics case studies</a>
:::
:::
<br>
::: link
::: link-header
#### {{< iconify icons8:pdf >}} Federal Anti-Discrimination Agency
:::
::: link-container
<a href = "https://www.antidiskriminierungsstelle.de/SharedDocs/downloads/EN/publikationen/Studie_en_Diskriminierungsrisiken_durch_Verwendung_von_Algorithmen.pdf?__blob=publicationFile&v=2" target = "_blank">Data science ethics case studies (see chapter 4)</a>
:::
:::