Reflecting on MIT's Moral Machine


Several weeks ago I wrote a walk-through of MIT's Moral Machine, which aims to gather a human perspective on the decisions that self-driving cars may have to make in the future. You should read that post here before reading this one. This reflection contains my thoughts on the Moral Machine and what I would change about it.

What is the Moral Machine?

An example scenario from MIT's Moral Machine
The website presents us with a scenario in which the brakes on a self-driving car have failed. The vehicle is careening toward a crosswalk of pedestrians, and the car has a decision to make - to switch lanes, or not to switch lanes. You are presented with 13 different scenarios in this context, but various parameters change between them. Each scenario has different numbers of people in the car and pedestrians on the crosswalk. In some scenarios, you have to choose whether the car will crash into a concrete barrier, which would kill all passengers in the vehicle. In some scenarios the pedestrians are jaywalking against a red light, bringing obeying the law into play. We are also presented with various types of people - old, young, fat, fit, male, female, doctor, criminal, and more. After deciding on 13 of these scenarios, the website analyzes our responses and tells us what we seemed to base our decisions on. These statistics are then compared against the averages of everyone who has ever completed the simulation, so we can see how our responses compare to everyone else's.
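To make the moving parts concrete, here is a rough sketch of the parameters a single scenario varies and the single answer it records. The field names are my own guesses for illustration, not MIT's actual data model.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Character:
    # Hypothetical attributes, mirroring the traits the site varies.
    role: str     # e.g. "doctor", "criminal", "athlete", "child"
    age: str      # "young", "adult", "elderly"
    sex: str      # "male", "female"
    fitness: str  # "fit", "average", "large"

@dataclass
class Scenario:
    passengers: List[Character]         # people riding in the self-driving car
    victims_if_stay: List[Character]    # who dies if the car keeps its lane
    victims_if_swerve: List[Character]  # who dies if it swerves (the passengers,
                                        # when swerving means hitting a concrete barrier)
    pedestrians_crossing_legally: bool  # False when the pedestrians are jaywalking

@dataclass
class Response:
    scenario: Scenario
    swerve: bool  # the only thing the simulation records: swerve or stay
```

After 13 such responses, the site has nothing to aggregate except those booleans and the traits of the characters involved, which is why the end-of-simulation statistics can only tell you which traits you appeared to favor.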

Dr. Iyad Rahwan's TED Talk

Dr. Iyad Rahwan
Iyad Rahwan is a director at the Max Planck Institute for Human Development and an associate professor at MIT. He is credited with helping to develop the concept behind MIT's Moral Machine. In 2017, he gave a TED Talk in which he discussed the moral decisions that self-driving cars will have to make, the purpose of the Moral Machine, and what their research with the simulation has found. I am going to quote and respond to some points he made in this TED Talk, so you may want to watch it before you read the following portion of this blog post.

Dr. Rahwan begins this talk by discussing the decisions that self-driving cars will be required to make. He suggests that while some may believe the scenarios presented in the Moral Machine are unrealistic, they are still important. Rahwan states that while self-driving cars could eliminate around 90% of traffic accidents by removing human error, a small percentage of accidents will remain. How much research will it take to eliminate those remaining accidents? I believe this is a very important point to acknowledge. While self-driving cars will eliminate most traffic accidents, the ones that remain will be the most difficult to address. Autonomous cars will have to make trade-offs, and we must think about and discuss those trade-offs and decisions.

The scenario presented in the Moral Machine is based on the trolley problem, which prompts individuals to make a decision based on two traditional ethical frameworks. Jeremy Bentham was an 18th- to 19th-century philosopher who is regarded as the founder of utilitarianism: promote the most good, do the least harm. Immanuel Kant was an 18th-century philosopher whose ideology centered on duty-bound principles. Kantianism falls under the umbrella of the deontological ethical framework, which bases right and wrong on a set of rules, not on the consequences of an action. In the Moral Machine simulation, Bentham would choose the action whose consequences are least harmful to all parties involved. Kant would base his decision on the rule of "thou shalt not kill": swerving the car means making the choice to kill someone, which Kant believes is wrong, so the car should be left to run its course. Both the trolley problem and the Moral Machine are exercises in applying one of these ethical frameworks over the other.
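As a toy illustration (my own simplification, not how any real vehicle is programmed), the two frameworks translate into two very different decision rules:

```python
def utilitarian_choice(victims_if_stay: int, victims_if_swerve: int) -> bool:
    """Bentham: minimize total harm. Returns True if the car should swerve."""
    return victims_if_swerve < victims_if_stay

def deontological_choice(victims_if_stay: int, victims_if_swerve: int) -> bool:
    """Kant, as read here: swerving is an active choice to kill, and
    'thou shalt not kill' forbids it regardless of the body count."""
    return False

# Five pedestrians ahead; two passengers die against the barrier if the car swerves.
print(utilitarian_choice(victims_if_stay=5, victims_if_swerve=2))    # True  -> swerve
print(deontological_choice(victims_if_stay=5, victims_if_swerve=2))  # False -> stay the course
```

Every Moral Machine scenario is essentially asking which of these two functions you want running in the car.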

Dr. Rahwan says that most people who complete the Moral Machine side with Bentham: they tend to make decisions that favor the least harm. This means people think autonomous cars should make decisions that save the most lives, even if that means sacrificing the passengers of the vehicle. Rahwan goes on to say that when people are surveyed about buying an autonomous vehicle that makes decisions this way, they say they would "absolutely not" purchase one. People want to buy cars that will protect them at all costs, even if that comes at the expense of the common good, the least overall harm. This is where Dr. Rahwan sets up the real conversation to be had around the Moral Machine: we believe that other people's autonomous vehicles should protect the common good and do the least harm, but I want my vehicle to protect my life at all costs, and I won't buy a vehicle that doesn't protect me.

So if people don't think an autonomous vehicle will protect them at all costs, they won't buy one. And if people don't buy autonomous cars, automobile accidents won't decrease at all. How should regulators and lawmakers go about solving this catch-22? Do we want vehicles that serve the common good but that nobody buys, or vehicles that protect their passengers and may sacrifice pedestrians in the process? Is the former even protecting the common good, if it results in more overall deaths because fewer autonomous cars end up on the road?

This brings us to the heart of what the Moral Machine is really about. Yes, the simulation seeks to "form an early picture of what trade-offs people are comfortable with". But Dr. Rahwan says the more important goal of the machine is to help participants recognize the difficulty of the decisions that autonomous vehicles, the companies that program them, and the regulators of those companies have to make. This tool is educational: its main benefit is to show us how difficult these decisions actually are. Which trade-offs are we comfortable with? How can we better understand the decisions that regulators are already making?

At one point in the TED Talk (at timestamp 11:50), Dr. Rahwan seems to make light of one participant's responses, which show that this person tends to punish jaywalking: their results show that they favor pedestrians who are obeying the law 100% of the time. If you read my first post, you'll remember my result on "obeying the law" was the same as this person's. I made an argument and a justification for this result that is, in my opinion, an ethical approach to the decisions being made. When Dr. Rahwan and the audience laugh at this person's responses, I can't help but wonder how this person came to this result. Were they some internet troll, or did they have logical and ethical arguments for the decisions they made? We will never know, because the Moral Machine does not collect any data on why an individual participant made the decisions they did. I will discuss this later in this post (see What didn't they ask? What did I want to say? below). I believe this is a big problem with the Moral Machine.

So my belief, after completing the Moral Machine and listening to Dr. Rahwan's talk, is that this is mainly an educational tool meant to create conversation about the types of decisions regulators will have to make and the questions we should be asking of our autonomous vehicles. This makes me seriously question several design decisions that were made on the Moral Machine website and in the simulation. Why are they not collecting data on the rationale behind people's decisions? Why do the results of the simulation not promote discussion between yourself and people who made different decisions than you did? Because of the limited scope of the simulation, I believe its ability to educate its participants and create meaningful conversation is limited. I will discuss how I would change the machine to address these concerns below.

What would I change about MIT's Moral Machine?

What did I want to know?

At the end of the simulation, I was presented with a page of statistics about my responses, measuring which parameters I valued most during the simulation. It also compares my data to the averages of everyone who has completed the Moral Machine simulation.

While I suppose it's interesting that other people place a higher value on "social value preference" than I do, or that I value "upholding the law" more than the average person, I don't know what any of this data means. What is the context of this data? What do my responses tell me about my ethics, or how they compare to everyone else's? The usefulness of this data is not clear to me. The simulation itself is a thought-provoking experiment, but the data feels like a missed opportunity. I feel like they could have done more with participant reflection after completing the simulation, which leads me to my next point.

What do others say about the specific scenarios?

I know how I justified each choice I made during the simulation, but I don't know how anyone else justified theirs. Am I right or wrong? How did others differ from me? I would like to hear the perspective of people who answered differently than I did so that I can evaluate the ethics of my decisions. The Moral Machine website itself does not facilitate any sort of conversation - you only complete the simulation, see your own data, and see the average person's data. I want to see who agrees and disagrees with me. I want to hear input from others. There needs to be more conversation here.

What didn't they ask? What did I want to say?

While each scenario in the simulation gives you a binary choice of swerving the self-driving car or not, the simulation does not ask for any additional information about that choice. Are they only interested in what I would do, and not in my justification for doing it? Even the about page on the Moral Machine website says that the automation of vehicles "calls for not just a clearer understanding of how humans make [decisions on human life and limb], but also a clearer understanding of how humans perceive machine intelligence making such choices." I feel that acting ethically is not always instinctual - it is aided by forethought, conversation, and reflection. The thought process that brought me to my decision is more interesting than the decision itself; therefore, the simulation should ask me to document that. Did people who made the same choice as me have the same justification that I did? Why are people making the decisions that they are making?
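As a concrete (and entirely hypothetical) sketch, the response record would only need one more field to capture this:

```python
from dataclasses import dataclass

@dataclass
class Response:
    scenario_id: int
    swerve: bool        # what the simulation already records
    justification: str  # what I wish it recorded: why the participant chose this

# A hypothetical record for a scenario where the pedestrians were jaywalking:
example = Response(
    scenario_id=7,
    swerve=False,
    justification="The pedestrians were crossing against a red light; "
                  "I weighed obeying the law above the raw numbers.",
)
```

Even a single free-text field like this would let researchers (and other participants) tell a considered ethical position apart from a troll clicking through.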

Are there interesting demographic factors that divide people on given scenarios?

While the results of the simulation evaluate my preferences on the gender, social value, age, and fitness of the fictional characters, the simulation does not ask me for any identifying characteristics about myself. Would it be interesting to see whether men and women differ on certain scenarios? Or whether older participants seem to value different things than younger ones do? These seem like valuable data points for research purposes, but they are not being collected.
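If the site did collect demographics, even a very simple analysis could surface those divides. A hypothetical sketch with made-up toy data:

```python
from collections import defaultdict

# Toy, invented records of the kind the Moral Machine could collect.
responses = [
    {"age_group": "18-30", "scenario_id": 7, "swerve": True},
    {"age_group": "18-30", "scenario_id": 7, "swerve": False},
    {"age_group": "60+",   "scenario_id": 7, "swerve": False},
    {"age_group": "60+",   "scenario_id": 7, "swerve": False},
]

tally = defaultdict(lambda: [0, 0])  # age_group -> [swerve count, total]
for r in responses:
    tally[r["age_group"]][1] += 1
    if r["swerve"]:
        tally[r["age_group"]][0] += 1

for group, (swerves, total) in tally.items():
    print(f"{group}: {swerves / total:.0%} chose to swerve on scenario 7")
```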

What would I improve about the Moral Machine?

Before the Simulation

The website gives very little context or information before you start. To ensure users fully understand their task, a short briefing on what is going on before the simulation would be helpful. Collecting data about the gender or age of each participant could also be useful for research purposes.

During the Simulation

The Moral Machine hides useful details from the participant unless "show description" is clicked, and the mobile version makes it even less clear that there is additional information to read on each scenario. Some users may not notice all of the parameters in a scenario, so this information should not be hidden when the page loads. The simulation should also ask the participant for a short written explanation of why they made the decision they did.

After the Simulation

The data presented after completing the simulation does not facilitate any sort of conversation, because you are not able to compare your decision process with anyone else's. You can only compare your aggregate results against the averages of everyone else who has participated. More information on how I differed from other people, what kinds of people I differed from, and why I differed from them would have made the simulation a more educational experience for me.
