A Challenge to Data Scientists by Renee Teate.
From the post:
As data scientists, we are aware that bias exists in the world. We read up on stories about how cognitive biases can affect decision-making. We know that, for instance, a resume with a white-sounding name will receive a different response than the same resume with a black-sounding name, and that writers of performance reviews use different language to describe contributions by women and men in the workplace. We read stories in the news about ageism in healthcare and racism in mortgage lending.
Data scientists are problem solvers at heart, and we love our data and our algorithms that sometimes seem to work like magic, so we may be inclined to try to solve these problems stemming from human bias by turning the decisions over to machines. Most people seem to believe that machines are less biased and more pure in their decision-making – that the data tells the truth, that the machines won’t discriminate.
…
Renee’s post summarizes a lot of information about bias, inside and outside of data science, and issues this challenge:
Data scientists, I challenge you. I challenge you to figure out how to make the systems you design as fair as possible.
An admirable sentiment, but one hard part is defining “…as fair as possible.”
Being professionally trained in a day-to-day “hermeneutic of suspicion,” as opposed to Paul Ricoeur’s analysis of texts (see Paul Ricoeur and the Hermeneutics of Suspicion: A Brief Overview and Critique by G.D. Robinson), I have yet to encounter a definition of “fair” that does not define winners and losers.
Data science relies on classification, which has as its avowed purpose the separation of items into different categories. Some categories will be treated differently than others. Otherwise there would be no reason to perform the classification.
Another hard part is that employers of data scientists are more likely to say:
Analyze data X for market segments responding to ad campaign Y.
As opposed to:
What do you think about our ads targeting tweens by the use of sexual content for our unhealthy product A?
Or change the questions to fit those asked of data scientists at any government intelligence agency.
The vast majority of data scientists are hired as data scientists, not amateur theologians.
Competence in data science has no demonstrable relationship to competence in ethics, fairness, morality, etc. Data scientists can have opinions about the same but shouldn’t presume to poach on other areas of expertise.
How would you feel if a competent user of spreadsheets decided to label themselves a “data scientist”?
Keep that in mind the next time someone starts to pontificate on “ethics” in data science.
PS: Renee is in the process of creating and assembling high-quality resources for anyone interested in data science. Be sure to explore her blog and other links after reading her post.