Archive for the ‘Ethics’ Category

Data Science Ethics: Who’s Lying to Hillary Clinton?

Sunday, December 20th, 2015

The usual ethics example for data science involves discrimination against some protected class: discrimination based on race, religion, ethnicity, and the like, most if not all of which is already illegal.

That’s not a question of ethics, that’s a question of staying out of jail.

A better ethics example is to ask: Who’s lying to Hillary Clinton about back doors for encryption?

I ask because of the following exchange in the debate on December 19, 2015, beginning with moderator Martha Raddatz's question:

Secretary Clinton, I want to talk about a new terrorist tool used in the Paris attacks, encryption. FBI Director James Comey says terrorists can hold secret communications which law enforcement cannot get to, even with a court order.

You’ve talked a lot about bringing tech leaders and government officials together, but Apple CEO Tim Cook said removing encryption tools from our products altogether would only hurt law-abiding citizens who rely on us to protect their data. So would you force him to give law enforcement a key to encrypted technology by making it law?

CLINTON: I would not want to go to that point. I would hope that, given the extraordinary capacities that the tech community has and the legitimate needs and questions from law enforcement, that there could be a Manhattan-like project, something that would bring the government and the tech communities together to see they’re not adversaries, they’ve got to be partners.

It doesn’t do anybody any good if terrorists can move toward encrypted communication that no law enforcement agency can break into before or after. There must be some way. I don’t know enough about the technology, Martha, to be able to say what it is, but I have a lot of confidence in our tech experts.

And maybe the back door is the wrong door, and I understand what Apple and others are saying about that. But I also understand, when a law enforcement official charged with the responsibility of preventing attacks — to go back to our early questions, how do we prevent attacks — well, if we can’t know what someone is planning, we are going to have to rely on the neighbor or, you know, the member of the mosque or the teacher, somebody to see something.

CLINTON: I just think there’s got to be a way, and I would hope that our tech companies would work with government to figure that out. Otherwise, law enforcement is blind — blind before, blind during, and, unfortunately, in many instances, blind after.

So we always have to balance liberty and security, privacy and safety, but I know that law enforcement needs the tools to keep us safe. And that’s what I hope there can be some understanding and cooperation to achieve.

Who do you think has told Secretary Clinton there is a way to have secure encryption and at the same time enable law enforcement access to encrypted data?

That would be a data scientist or someone posing as a data scientist. Yes?

I assume you have read: Keys Under Doormats: Mandating Insecurity by Requiring Government Access to All Data and Communications by H. Abelson, R. Anderson, S. M. Bellovin, J. Benaloh, M. Blaze, W. Diffie, J. Gilmore, M. Green, S. Landau, P. G. Neumann, R. L. Rivest, J. I. Schiller, B. Schneier, M. Specter, D. J. Weitzner.


Twenty years ago, law enforcement organizations lobbied to require data and communication services to engineer their products to guarantee law enforcement access to all data. After lengthy debate and vigorous predictions of enforcement channels “going dark,” these attempts to regulate security technologies on the emerging Internet were abandoned. In the intervening years, innovation on the Internet flourished, and law enforcement agencies found new and more effective means of accessing vastly larger quantities of data. Today, there are again calls for regulation to mandate the provision of exceptional access mechanisms. In this article, a group of computer scientists and security experts, many of whom participated in a 1997 study of these same topics, has convened to explore the likely effects of imposing extraordinary access mandates.

We have found that the damage that could be caused by law enforcement exceptional access requirements would be even greater today than it would have been 20 years ago. In the wake of the growing economic and social cost of the fundamental insecurity of today’s Internet environment, any proposals that alter the security dynamics online should be approached with caution. Exceptional access would force Internet system developers to reverse “forward secrecy” design practices that seek to minimize the impact on user privacy when systems are breached. The complexity of today’s Internet environment, with millions of apps and globally connected services, means that new law enforcement requirements are likely to introduce unanticipated, hard to detect security flaws. Beyond these and other technical vulnerabilities, the prospect of globally deployed exceptional access systems raises difficult problems about how such an environment would be governed and how to ensure that such systems would respect human rights and the rule of law.

Whether you agree on policy grounds about back doors to encryption or not, is there any factual doubt that back doors to encryption leave users insecure?

That’s an important point because Hillary’s data science advisers should have clued her in that her position is factually false. With or without a “Manhattan Project.”
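To see why the "Keys Under Doormats" authors treat this as settled, consider the "forward secrecy" design practice the abstract mentions, which exceptional access would force developers to reverse. Here is a toy sketch of the idea, not real cryptography, with all names and the XOR "cipher" being my own illustrative inventions:

```python
# Toy illustration (NOT real cryptography) of why key escrow undermines
# forward secrecy. The XOR "cipher" and all names are hypothetical.
import secrets

def xor_encrypt(key: bytes, msg: bytes) -> bytes:
    # One-time-pad-style XOR, for illustration only.
    return bytes(k ^ m for k, m in zip(key, msg))

msg = b"meet at noon"

# Forward secrecy: each session uses a fresh key that is deleted after use.
session_key = secrets.token_bytes(len(msg))
ciphertext = xor_encrypt(session_key, msg)
del session_key  # once deleted, this session's traffic cannot be recovered

# Escrowed ("exceptional") access: every session key is also retained.
escrow_db = {}
key2 = secrets.token_bytes(len(msg))
escrow_db["session-2"] = key2  # in a real design, wrapped under a master key
ct2 = xor_encrypt(key2, msg)

# Whoever obtains the escrow database (government, insider, or attacker)
# can decrypt every escrowed session, past and future:
recovered = xor_encrypt(escrow_db["session-2"], ct2)
assert recovered == msg
```

The point of the sketch is structural, not cryptographic: the escrow database is a single target whose compromise exposes all users, which is precisely the insecurity the paper describes.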

Here are the ethical questions with regard to Hillary’s position on back doors for encryption:

  1. Did Hillary’s data scientist(s) tell her that access by the government to encrypted data means no security for users?
  2. What ethical obligations do data scientists have to advise public office holders or candidates that their positions are at variance with known facts?
  3. What ethical obligations do data scientists have to caution their clients when they persist in spreading misinformation, in this case about encryption?
  4. What ethical obligations do data scientists have to expose their reports to a client outlining why the client’s public position is factually false?

Many people will differ on the policy question of access to encrypted data, but that such access weakens the protection for all users is beyond reasonable doubt.

If data scientists want to debate ethics, at least make it about an issue with consequences. Especially for the data scientists.

Questions with no risk aren’t ethics questions, they are parlor entertainment games.

PS: Is there an ethical data scientist in the Hillary Clinton campaign?

The Moral Failure of Computer Scientists [Warning: Scam Alert!]

Sunday, December 13th, 2015

The Moral Failure of Computer Scientists by Kaveh Waddell.

From the post:

Computer scientists and cryptographers occupy some of the ivory tower’s highest floors. Among academics, their work is prestigious and celebrated. To the average observer, much of it is too technical to comprehend. The field’s problems can sometimes seem remote from reality.

But computer science has quite a bit to do with reality. Its practitioners devise the surveillance systems that watch over nearly every space, public or otherwise—and they design the tools that allow for privacy in the digital realm. Computer science is political, by its very nature.

That’s at least according to Phillip Rogaway, a professor of computer science at the University of California, Davis, who has helped create some of the most important tools that secure the Internet today. Last week, Rogaway took his case directly to a roomful of cryptographers at a conference in Auckland, New Zealand. He accused them of a moral failure: By allowing the government to construct a massive surveillance apparatus, the field had abused the public trust. Rogaway said the scientists had a duty to pursue social good in their work.

He likened the danger posed by modern governments’ growing surveillance capabilities to the threat of nuclear warfare in the 1950s, and called upon scientists to step up and speak out today, as they did then.

I spoke to Rogaway about why cryptographers fail to see their work in moral terms, and the emerging link between encryption and terrorism in the national conversation. A transcript of our conversation appears below, lightly edited for concision and clarity.

I don’t disagree with Rogaway that all science and technology is political. I might use the term social instead but I agree, there are no neutral choices.

Having said that, I do disagree that Rogaway has the standing to pre-package a political stance colored as “morals” and denounce others as “immoral” if they disagree.

It is one of the oldest tricks in rhetoric but quite often effective, which is why people keep using it.

If Rogaway is correct that CS and technology are political, then his stance for a particular take on government, surveillance and cryptography is equally political.

Not that I disagree with his stance, but I don’t consider it to be a moral choice.

Anything you can do to impede, disrupt or interfere with any government surveillance is fine by me. I won’t complain. But that’s because government surveillance, the high-tech kind, is a waste of time and effort.

Rogaway uses scientists who spoke out in the 1950s about the threat of nuclear warfare as an example. Some example.

The Federation of American Scientists estimates that as of September 2015, there are approximately 15,800 nuclear weapons in the world.

Hmmm, doesn’t sound like their moral outrage was very effective does it?

There will be sessions, presentations, and conferences, along with comped travel and lodging, publications for tenure, etc., but the sum of the discussion of morality in computer science will be largely the same.

The reason for the sameness of result is that discussions, papers, resolutions and the rest aren’t nearly as important as the ethical and moral choices you make in day-to-day practice as a computer scientist.

Choices in the practice of computer science make a difference, discussions of fictional choices don’t. It’s really that simple.*

*That’s not entirely fair. The industry of discussing moral choices without making any of them is quite lucrative and it depletes the bank accounts of those snared by it. So in that sense it does make a difference.

Racist algorithms: how Big Data makes bias seem objective

Sunday, December 6th, 2015

Racist algorithms: how Big Data makes bias seem objective by Cory Doctorow.

From the post:

The Ford Foundation’s Michael Brennan discusses the many studies showing how algorithms can magnify bias — like the prevalence of police background check ads shown against searches for black names.

What’s worse is the way that machine learning magnifies these problems. If an employer only hires young applicants, a machine learning algorithm will learn to screen out all older applicants without anyone having to tell it to do so.

Worst of all is that the use of algorithms to accomplish this discrimination provides a veneer of objective respectability to racism, sexism and other forms of discrimination.

Cory has a good example of “hidden” bias in data analysis and has suggestions for possible improvement.
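The magnification Brennan describes, a learner reproducing bias from its training labels without being told to, can be shown in a few lines. This is a toy sketch with hypothetical data, where a majority-vote rule stands in for any classifier:

```python
# Toy sketch: a learner trained on a biased employer's past decisions
# reproduces the bias. Data and rule are hypothetical.
from collections import defaultdict

history = [  # (age_bucket, hired) pairs from a biased hiring record
    ("young", True), ("young", True), ("young", True),
    ("older", False), ("older", False), ("older", True),
]

# "Training": take the majority label per bucket; this stands in for
# whatever pattern a real classifier would extract from the same data.
votes = defaultdict(list)
for bucket, hired in history:
    votes[bucket].append(hired)
model = {b: sum(v) > len(v) / 2 for b, v in votes.items()}

# The model now screens out older applicants, though age was never
# named as a criterion anywhere in the code:
assert model["young"] is True
assert model["older"] is False
```

Nothing in the training step mentions discrimination; the bias lives entirely in the historical labels, which is why it is so easy to launder through an algorithm.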

Although I applaud the notion of “algorithmic transparency,” the issue of bias in algorithms may be more subtle than you think.

Lauren J. Young reports in Computer Scientists Find Bias in Algorithms that the bias problem can be especially acute with self-improving algorithms. Algorithms, like users, have experiences, and those experiences can lead to bias.

Lauren’s article is a good introduction to the concept of bias in algorithms, but for the full monty, see: Certifying and removing disparate impact by Michael Feldman, et al.


What does it mean for an algorithm to be biased? In U.S. law, unintentional bias is encoded via disparate impact, which occurs when a selection process has widely different outcomes for different groups, even as it appears to be neutral. This legal determination hinges on a definition of a protected class (ethnicity, gender, religious practice) and an explicit description of the process.

When the process is implemented using computers, determining disparate impact (and hence bias) is harder. It might not be possible to disclose the process. In addition, even if the process is open, it might be hard to elucidate in a legal setting how the algorithm makes its decisions. Instead of requiring access to the algorithm, we propose making inferences based on the data the algorithm uses.

We make four contributions to this problem. First, we link the legal notion of disparate impact to a measure of classification accuracy that while known, has received relatively little attention. Second, we propose a test for disparate impact based on analyzing the information leakage of the protected class from the other data attributes. Third, we describe methods by which data might be made unbiased. Finally, we present empirical evidence supporting the effectiveness of our test for disparate impact and our approach for both masking bias and preserving relevant information in the data. Interestingly, our approach resembles some actual selection practices that have recently received legal scrutiny.

Bear in mind that disparate impact is only one form of bias for a selected set of categories. And that bias can be introduced prior to formal data analysis.

Rather than say data or algorithms can be made unbiased, say rather that known biases can be reduced to acceptable levels, for some definition of acceptable.

Big Data Ethics?

Saturday, December 5th, 2015

Ethics are a popular topic in big data and related areas, as I was reminded by Sam Ransbotham’s The Ethics of Wielding an Analytical Hammer.

Here’s a big data ethics problem.

In order to select individuals based on some set of characteristics, habits, etc., we first must define the selection criteria.

Unfortunately, we don’t have a viable profile for terrorists, which explains in part why they can travel under their actual names, with their own identification, and not be stopped by the authorities.

So, here’s the ethical question: Is it ethical for contractors and data scientists to offer data mining services to detect terrorists when there is no viable profile for a terrorist?

For all the hand wringing about ethics, basic honesty seems to be in short supply when talking about big data and the search for terrorists.
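Part of the dishonesty is arithmetical. Even granting such services an implausibly accurate detector, the base-rate problem guarantees the flagged population is almost entirely innocent. A quick computation with hypothetical numbers:

```python
# Base-rate sketch: a "99% accurate" terrorist detector applied to a
# large population. All numbers are hypothetical round figures.
population = 300_000_000
actual = 3_000                 # assumed number of actual terrorists
sensitivity = 0.99             # detector flags 99% of real terrorists
false_positive_rate = 0.01     # detector wrongly flags 1% of innocents

true_flags = actual * sensitivity                          # 2,970
false_flags = (population - actual) * false_positive_rate  # 2,999,970
precision = true_flags / (true_flags + false_flags)
# precision is under 0.001: more than 99.9% of flagged people are innocent.
```

No refinement of the model fixes this while terrorists remain vanishingly rare, which is why the absence of a viable profile matters more than any claimed accuracy figure.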


Encyclopedia of Ethical Failure — Updated October 2014

Sunday, November 16th, 2014

Encyclopedia of Ethical Failure — Updated October 2014 by the Department of Defense, Office of General Counsel, Standards of Conduct Office. (Word Document)

From the introduction:

The Standards of Conduct Office of the Department of Defense General Counsel’s Office has assembled the following selection of cases of ethical failure for use as a training tool. Our goal is to provide DoD personnel with real examples of Federal employees who have intentionally or unwittingly violated the standards of conduct. Some cases are humorous, some sad, and all are real. Some will anger you as a Federal employee and some will anger you as an American taxpayer.

Please pay particular attention to the multiple jail and probation sentences, fines, employment terminations and other sanctions that were taken as a result of these ethical failures. Violations of many ethical standards involve criminal statutes. Protect yourself and your employees by learning what you need to know and accessing your Agency ethics counselor if you become unsure of the proper course of conduct. Be sure to access them before you take action regarding the issue in question. Many of the cases displayed in this collection could have been avoided completely if the offender had taken this simple precaution.

The cases have been arranged according to offense for ease of access. Feel free to reproduce and use them as you like in your ethics training program. For example – you may be conducting a training session regarding political activities. Feel free to copy and paste a case or two into your slideshow or handout – or use them as examples or discussion problems. If you have a case you would like to make available for inclusion in a future update of this collection, please email it to OSD.SOCO@MAIL.MIL or you may fax it to (703) 695-4970.

One of the things I like about the United States military is they have no illusions about being better or worse than any other large organization and they prepare accordingly. Instead of pretending they are “…shocked, shocked to find gambling…,” they are prepared for rule breaking and try to keep it in check.

If you are interested in exploring or mapping this area, you will find the U.S. Office of Government Ethics useful. Unfortunately, the “Office of Inspector General” is distinct for each agency so collating information across executive departments will be challenging. To say nothing of obtaining similar information for other branches of the United States government.

Not from a technical standpoint for a topic map but from a data mining and analysis perspective.

I first saw this at Full Text Reports as Encyclopedia of Ethical Failure — Updated October 2014.

Ethics and Big Data

Monday, May 26th, 2014

Ethical research standards in a world of big data by Caitlin M. Rivers and Bryan L. Lewis.


In 2009 Ginsberg et al. reported using Google search query volume to estimate influenza activity in advance of traditional methodologies. It was a groundbreaking example of digital disease detection, and it still remains illustrative of the power of gathering data from the internet for important research. In recent years, the methodologies have been extended to include new topics and data sources; Twitter in particular has been used for surveillance of influenza-like-illnesses, political sentiments, and even behavioral risk factors like sentiments about childhood vaccination programs. As the research landscape continuously changes, the protection of human subjects in online research needs to keep pace. Here we propose a number of guidelines for ensuring that the work done by digital researchers is supported by ethical-use principles. Our proposed guidelines include: 1) Study designs using Twitter-derived data should be transparent and readily available to the public. 2) The context in which a tweet is sent should be respected by researchers. 3) All data that could be used to identify tweet authors, including geolocations, should be secured. 4) No information collected from Twitter should be used to procure more data about tweet authors from other sources. 5) Study designs that require data collection from a few individuals rather than aggregate analysis require Institutional Review Board (IRB) approval. 6) Researchers should adhere to a user’s attempt to control his or her data by respecting privacy settings. As researchers, we believe that a discourse within the research community is needed to ensure protection of research subjects. These guidelines are offered to help start this discourse and to lay the foundations for the ethical use of Twitter data.
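Guideline 3, securing anything that could identify tweet authors, is at least mechanically easy to honor before analysis begins. A minimal sketch; the field names below are hypothetical placeholders, not Twitter's actual API schema:

```python
# Sketch of guideline 3: drop identifying fields before analysis.
# Field names are hypothetical, not Twitter's API schema.
def anonymize(tweet: dict) -> dict:
    identifying = {"user_id", "screen_name", "geo", "coordinates", "place"}
    return {k: v for k, v in tweet.items() if k not in identifying}

tweet = {"text": "flu is awful this week", "geo": (38.9, -77.0), "user_id": 42}
assert anonymize(tweet) == {"text": "flu is awful this week"}
```

The harder guidelines (respecting context and privacy settings, guideline 4's ban on cross-referencing other sources) are policy commitments no code can enforce, which is rather the point of the next paragraph.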

I am curious: who is going to follow this suggested code of ethics?

Without long consideration, obviously not the NSA, FBI, CIA, DoD, or any employee of the United States government.

Ditto for the security services in any country plus their governments.

Industry players are well known for their near perfect recidivism rate on corporate crime so not expecting big data ethics there.

Drug cartels? Anyone shipping cocaine in multi-kilogram lots is unlikely to be interested in Big Data ethics.

That rather narrows the pool of prospective users of a code of ethics for big data doesn’t it?

I first saw this in a tweet by Ed Yong.

Mortar’s Open Source Community

Tuesday, November 19th, 2013

Building Mortar’s Open Source Community: Announcing Public Plans by K. Young.

From the post:

We’re big fans of GitHub. There are a lot of things to like about the company and the fantastic service they’ve built. However, one of the things we’ve come to admire most about GitHub is their pricing model.

If you’re giving back to the community by making your work public, you can use GitHub for free. It’s a great approach that drives tremendous benefits to the GitHub community.

Starting today, Mortar is following GitHub’s lead in supporting those who contribute to the data science community.

If you’re improving the data science community by allowing your Mortar projects to be seen and forked by the public, we will support you by providing free access to our complete platform (including unlimited development time, up to 25 public projects, and email support). In short, you’ll pay nothing beyond Amazon Web Services’ standard Elastic MapReduce fees if you decide to run a job.

A good illustration of the difference between talking about ethics (Ethics of Big Data?) and acting ethically.

Acting ethically benefits the community.

Government grants to discuss ethics, well, you know who benefits from that.

Ethics of Big Data?

Tuesday, November 19th, 2013

The ethics of big data: A council forms to help researchers avoid pratfalls by Jordan Novet.

From the post:

Big data isn’t just something for tech companies to talk about. Researchers and academics are forming a council to analyze the hot technology category from legal, ethical, and political angles.

The researchers decided to create the council in response to a request from the National Science Foundation (NSF) for “innovation projects” involving big data.

The Council for Big Data, Ethics, and Society will convene for the first time next year, with some level of participation from the NSF. Alongside Microsoft researchers Kate Crawford and Danah Boyd, two computer-science-savvy professors will co-direct the council: Geoffrey Bowker from the University of California, Irvine, and Helen Nissenbaum of New York University.

Through “public commentary, events, white papers, and direct engagement with data analytics projects,” the council will “address issues such as security, privacy, equality, and access in order to help guard against the repetition of known mistakes and inadequate preparation,” according to a fact sheet the White House released on Tuesday.

“We’re doing all of these major investments in next-generation internet (projects), in big data,” Fen Zhao, an NSF staff associate, told VentureBeat in a phone interview. “How do we in the research-and-development phase make sure they’re aware and cognizant of any issues that may come up?”

Odd that I should encounter this just after seeing the latest NSA surveillance news.

Everyone cites the Tuskegee syphilis study as an example of research with ethical lapses.

Tuskegee is only one of many ethical lapses in American history. I think hounding native Americans to near extermination would make a list of moral lapses. But, that was more application than research.

It doesn’t require training in ethics to know Tuskegee and the treatment of native Americans were wrong.

And whatever “ethics” come out of this study are likely to resemble the definition of a prisoner of war in Geneva Convention (III), Article 4(a)(2):

(2) Members of other militias and members of other volunteer corps, including those of organized resistance movements, belonging to a Party to the conflict and operating in or outside their own territory, even if this territory is occupied, provided that such militias or volunteer corps, including such organized resistance movements, fulfill the following conditions:

(a) that of being commanded by a person responsible for his subordinates;

(b) that of having a fixed distinctive sign recognizable at a distance;

(c) that of carrying arms openly;

(d) that of conducting their operations in accordance with the laws and customs of war.

That may seem neutral on its face, but it’s fair to say that major nation states, and not the groups that have differences with them, are likely to meet those requirements.

In fact, the Laws of War Deskbook argues in part that members of the Taliban had no distinctive uniforms and thus no POW status. (At page 79, footnote 31.)

The point being that discussions of ethics should be about concrete cases, so we can judge who will win and who will lose.

Otherwise you will have general principles of ethics that favor the rule makers.