Data Science Ethics

Course by Coursera
About the course

What are the ethical considerations regarding the privacy and control of consumer information and big data, especially in the aftermath of recent large-scale data breaches?

This course provides a framework to analyze these concerns as you examine the ethical and privacy implications of collecting and managing big data. Explore the broader impact of the data science field on modern society and the principles of fairness, accountability and transparency as you gain a deeper understanding of the importance of a shared set of ethical values. You will examine the need for voluntary disclosure when leveraging metadata to inform basic algorithms and/or complex artificial intelligence systems while also learning best practices for responsible data management, understanding the significance of the Fair Information Practices Principles Act and the laws concerning the "right to be forgotten."
This course will help you answer questions such as who owns data, how we value privacy, how to obtain informed consent, and what it means to be fair.
Data scientists and anyone beginning to use or expand their use of data will benefit from this course. No particular prior knowledge is needed.

What are Ethics?
Module 1 of this course establishes a basic foundation in the notion of simple utilitarian ethics we use for this course. The lecture material and the quiz questions are designed to get most people to come to an agreement about right and wrong, using the utilitarian framework taught here. If you bring your own moral sense to bear, or think hard about possible counter-arguments, it is likely that you can arrive at a different conclusion. But that discussion is not what this course is about. So resist that temptation, so that we can jointly lay a common foundation for the rest of this course.
History, Concept of Informed Consent
Early experiments on human subjects were conducted by scientists intent on advancing medicine for the benefit of all humanity, but with disregard for the welfare of the individual human subjects. Often these experiments were performed by white scientists on Black subjects. In this module we will talk about the laws that govern the Principle of Informed Consent. We will also discuss why informed consent doesn’t work well for retrospective studies, or for the customers of electronic businesses.
Data Ownership
Who owns data about you? We'll explore that question in this module. A few examples we'll consider include copyrights for biographies, ownership of photos posted online, Yelp, TripAdvisor, public data capture, and data sale. We'll also explore the limits on the recording and use of data.
Privacy
Privacy is a basic human need. Privacy means the ability to control information about yourself, not necessarily the ability to hide things. We have seen the rise of different value systems with regard to privacy. Kids today are more likely to share personal information on social media, for example. But while values are changing, this doesn’t remove the fundamental need to be able to control personal information. In this module we'll examine the relationship between the services we are provided and the data we provide in exchange: for example, the location of a cell phone. We'll also compare and contrast "data" and "metadata".
Anonymity
Certain transactions can be performed anonymously. But many cannot, including those that involve the physical delivery of a product. Two examples related to anonymous transactions we'll look at are "blockchains" and "bitcoin". We'll also look at some of the drawbacks that come with anonymity.
Data Validity
Data validity is not a new concern. All too often, we see the inappropriate use of Data Science methods leading to erroneous conclusions. This module points out common errors, in language suited for a student with limited exposure to statistics. We'll focus on the notion of representative sample: opinionated customers, for example, are not necessarily representative of all customers.
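The point about representative samples can be illustrated with a small sketch. The Python below uses made-up satisfaction scores for ten hypothetical customers, of whom only the most opinionated left a review. The data, the field names, and the `average_score` helper are all illustrative assumptions, not material from the course.

```python
from statistics import mean

# Hypothetical data: satisfaction scores (1-5) for ten customers.
# Only customers with strong opinions bothered to leave a review.
CUSTOMERS = [
    {"score": 1, "left_review": True},
    {"score": 1, "left_review": True},
    {"score": 3, "left_review": False},
    {"score": 4, "left_review": False},
    {"score": 4, "left_review": False},
    {"score": 4, "left_review": False},
    {"score": 4, "left_review": False},
    {"score": 4, "left_review": False},
    {"score": 4, "left_review": False},
    {"score": 5, "left_review": True},
]


def average_score(customers, reviewers_only=False):
    """Mean score over all customers, or only over those who left a review."""
    scores = [
        c["score"]
        for c in customers
        if c["left_review"] or not reviewers_only
    ]
    return mean(scores)


all_customers = average_score(CUSTOMERS)
reviewers = average_score(CUSTOMERS, reviewers_only=True)

print(f"All customers:  {all_customers:.2f}")
print(f"Reviewers only: {reviewers:.2f}")
```

In this invented data set the reviewers average about 2.3 while the full customer base averages 3.4, so a conclusion drawn from the reviews alone would understate overall satisfaction.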
Algorithmic Fairness
What could be fairer than a data-driven analysis? Surely the dumb computer cannot harbor prejudice or stereotypes. The analysis technique itself may indeed be completely neutral. But the assumptions, the model, the training data, and so forth are all boundary conditions set by humans, who may reflect their biases in the analysis result, possibly without even intending to do so. Only recently have people begun to think about how algorithmic decisions can be unfair. Consider this article, published in the New York Times. This module discusses this cutting-edge issue.
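One simple way to make this concern concrete is to compare how often a decision procedure gives a favorable outcome to different groups. The sketch below does that for a made-up set of loan decisions. The groups, the data, and the `approval_rate` helper are hypothetical, and comparing approval rates is only one of several possible fairness checks, not necessarily the one used in the course.

```python
# Hypothetical decisions: (group, approved) pairs from some automated process.
DECISIONS = [
    ("A", True), ("A", True), ("A", True), ("A", True), ("A", False),
    ("B", True), ("B", True), ("B", False), ("B", False), ("B", False),
]


def approval_rate(decisions, group):
    """Fraction of applicants in `group` whose application was approved."""
    outcomes = [approved for g, approved in decisions if g == group]
    return sum(outcomes) / len(outcomes)


rate_a = approval_rate(DECISIONS, "A")
rate_b = approval_rate(DECISIONS, "B")
gap = rate_a - rate_b

print(f"Group A approval rate: {rate_a:.0%}")
print(f"Group B approval rate: {rate_b:.0%}")
print(f"Gap: {gap:.0%}")
```

A gap like this does not by itself prove that the procedure is unfair, but it flags a result that deserves a closer look at the assumptions, model, and training data behind it.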
Societal Consequences
In Module 8, we consider societal consequences of Data Science that we should be concerned about even if there are no issues with fairness, validity, anonymity, privacy, ownership or human subjects research. These “systemic” concerns are often the hardest to address, yet they are just as important as the issues discussed before. For example, we consider ossification, the tendency of algorithmic methods to learn and codify the current state of the world and thereby make it harder to change. We also consider information asymmetry, which has long been exploited to the advantage of some and the disadvantage of others. Information technology makes the spread of information easier, and hence generally decreases asymmetry. However, Big Data sets and sophisticated analyses increase asymmetry in favor of those with the ability to acquire and access them.
Code of Ethics
Finally, in Module 9, we tie all the issues we have considered together into a simple, two-point code of ethics for the practitioner.
This module contains lists of attributions for the external audio-visual resources used throughout the course.
H.V. Jagadish
Courses on this platform are often free of charge. However, certification and specialization in a certain domain come at a cost. All video lectures are followed by practical assignments and content for individual study. Assignments are checked by fellow students, by the teacher, or, in the case of tests, automatically by the system. Some courses do not have a Russian voice-over, but Russian subtitles are available. Students who miss deadlines or do not complete their assignments are automatically moved to the next session of the course.
Comments (180)
This course presented some interesting case studies, but never presented a real ethical system. In particular, the instructor frequently equated ethical behavior with socially acceptable behavior (this activity by corporation x was unethical because it resulted in public outcry). Someone with a background in ethics really should have been consulted in course development. On a purely practical matter, the TA did not seem to have any training in ethics at all. He just responded to every comment in the discussion area with a "what about" statement that sounded like it came straight from a late night dorm room bs session.
Aaron Z
I've been disappointed with the course overall. It's a very interesting topic, one that's relevant to the times we live in. But I feel that many of the big questions are neither asked nor answered in the course. The presenter's background is in engineering, not philosophy. This isn't a fundamental problem in itself, but it does mean that concepts are left very vaguely defined. So 'ethics' is, at one point, defined more or less as 'what's socially agreed-upon'. But in the questions as well as the videos, the terminology shifts between 'ethically right/wrong', 'appropriate', 'have to do something', 'by rights', 'legally', and lots more besides. So it's far from clear what the right or wrong position might be in many of the thought-experiments described in the quiz questions. What's more, if an action isn't in itself ethically right, that doesn't entail that it must be ethically wrong: it could be neither. There's virtually no input from other voices, either: no discussions with, say, experts in the field of data ethics, or moral philosophers, or whatever. At one point in week 1, the presenter correctly points out that 'data ownership is really complex'. So it'd be useful to have a MOOC that makes it less complex: that asks hard questions, and critically examines the range of possible answers. Unfortunately, this MOOC isn't it.
Daithi M W
Good course. Challenging and thought-provoking. The professor was very good, but even if you complete all assignments, your certification and grade depend upon having others in the course go in and review your assignment. Even if you review more than you are required to review, if others don't review yours, you don't pass or get your certification. So be aware of that if you are attending a class in the hopes of obtaining a certification. It is not guaranteed even if you do all of the work.
Katherine S
This is really a great course. It is a starting point for anyone who wants to get deeper into Data Science Ethics. Even though I feel that I am fairly familiar with the topic, I learned a lot, and the case studies in particular were really great. It should be a must for anyone in Data Science.
Kim K L