This is Part 2 of a 7-part series revisiting the trolley problem with deeper philosophical tools.

In 1971, the philosopher John Rawls proposed a thought experiment that remains one of the most influential ideas in political philosophy. Imagine you are designing the rules of a society, but you don't know what position you'll occupy within it. You don't know whether you'll be rich or poor, healthy or sick, part of the majority or a marginalized minority. Behind this "veil of ignorance," Rawls argued, rational self-interest would lead you to design fair institutions, because you'd want to protect yourself in case you ended up in the worst-off position.[1]

Rawls called this starting point the "original position." It was meant to be a thought experiment about justice. It may also be one of the more useful frameworks available for thinking about algorithmic fairness.

The Original Position as Design Methodology

The veil of ignorance works because it converts empathy from a feeling into a procedure. You don't need to care about other people to design fair systems behind the veil. You just need to care about yourself, while acknowledging that "yourself" could be anyone.

A silhouetted figure standing before multiple doorways, each leading to a different life circumstance, unable to see which door leads where
Behind the veil, you might be anyone. The system you design has to work for all of them.
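To make that procedure concrete, here is a minimal sketch in Python: a design is scored by the expected utility of a randomly assigned position rather than a known one. The designs, positions, and utility numbers are all hypothetical, invented for illustration.

```python
import random

# Hypothetical outcome table: the utility of each social position under
# two candidate designs. The numbers are illustrative, not drawn from Rawls.
DESIGNS = {
    "design_a": {"well_off": 90, "median": 60, "worst_off": 5},
    "design_b": {"well_off": 70, "median": 55, "worst_off": 40},
}

def veiled_value(design, trials=10_000):
    """Expected utility when your position is assigned at random:
    the veil turns self-interest into a test of the whole design."""
    positions = list(design)
    total = sum(design[random.choice(positions)] for _ in range(trials))
    return total / trials

for name, outcomes in DESIGNS.items():
    print(name, "expected:", round(veiled_value(outcomes), 1),
          "floor:", min(outcomes.values()))
```

Notice that you never need to specify who you care about; scoring from a random position does the impartiality for you.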

Applied to a self-driving car's ethics: would you accept the vehicle's decision framework if you didn't know whether you'd be the passenger or the pedestrian? The young person or the elderly one? The jaywalker or the person lawfully crossing? Behind the veil, you'd likely want a system that doesn't categorically sacrifice any group, because you might belong to that group.

Applied to medical triage: would you accept a QALY-based allocation system if you didn't know whether you'd be able-bodied or living with a disability?[2] A quality-adjusted life year weighs each year of life by a quality score, with full health at 1.0 and disability or illness below it, so years lived with disability count for less. Behind the veil, where you might be the person whose life the algorithm discounts, that metric looks considerably less neutral.
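The arithmetic behind that discount is simple enough to show directly. A minimal sketch follows; the 0.6 disability weight is an illustrative placeholder, not a clinical value.

```python
# A minimal sketch of the QALY arithmetic described above.
# The quality weights here are illustrative placeholders.

def qalys(life_years, quality_weight):
    """QALYs = years of life x quality weight, where 1.0 is full health
    and lower weights discount years lived with illness or disability."""
    return life_years * quality_weight

# Ten added years at full health vs. with an assumed disability weight of 0.6:
print(qalys(10, 1.0))  # 10.0 QALYs
print(qalys(10, 0.6))  # 6.0 QALYs: the same decade counts for less
```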

Applied to hiring algorithms: would you accept the screening criteria if you didn't know your gender, the name of your university, or whether you'd taken a career break to care for a family member? Behind the veil, "culture fit" as a hiring metric begins to look less like a neutral standard and more like a way of reproducing whatever the existing culture happens to be.[6]

Applied to content moderation: would you accept the platform's rules if you didn't know whether you'd be the person posting, the person harmed by the post, or the moderator reviewing it at scale? The answer likely depends on which position feels most vulnerable to you, and behind the veil, you can't know.

Maximin and Its Critics

Rawls argued that behind the veil, rational people would adopt what he called the "maximin" principle: choose the arrangement that maximizes the minimum outcome. In other words, design the system so that the worst-off position is as good as possible, because you might be the one in it.[1]

A floor of tiles where most glow steadily but a cluster in one corner flickers and dims, representing the users who fare worst under an algorithm optimized for averages
Maximin says: optimize for the experience of the users who fare worst, because behind the veil, you might be one of them.

For algorithmic systems, maximin suggests something specific: rather than optimizing for average performance across all users, optimize for the experience of the users who fare worst. An algorithm that works well for 95% of users but systematically fails for 5% might look good on aggregate metrics. Behind the veil, where you might be in that 5%, the aggregate is cold comfort.
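Here is a sketch of what that selection rule looks like in code, under assumed numbers: two hypothetical models, one tuned for average accuracy and one for its worst-served group. The aggregate metric and maximin disagree about which to ship.

```python
# A minimal sketch of maximin model selection over hypothetical per-group
# accuracies. Model names, groups, and numbers are assumed for illustration.
candidates = {
    "model_avg": {"group_a": 0.98, "group_b": 0.97, "group_c": 0.60},
    "model_fair": {"group_a": 0.85, "group_b": 0.84, "group_c": 0.80},
}

def average_score(groups):
    return sum(groups.values()) / len(groups)

def maximin_score(groups):
    # Judge each model by its worst-served group, not its average.
    return min(groups.values())

best_by_average = max(candidates, key=lambda m: average_score(candidates[m]))
best_by_maximin = max(candidates, key=lambda m: maximin_score(candidates[m]))
print(best_by_average)   # model_avg: wins on the aggregate metric
print(best_by_maximin)   # model_fair: wins behind the veil
```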

This connects to a real debate in algorithmic fairness research. "Fairness through unawareness," the approach of simply removing protected attributes like race and gender from the model, often fails because algorithms can reconstruct those attributes from proxies such as zip codes or browsing patterns.[3] "Fairness through awareness," which explicitly models group differences to ensure equitable outcomes, is more aligned with what the veil of ignorance suggests. Behind the veil, you'd want the system to account for your potential disadvantage, not pretend it doesn't exist.
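One concrete consequence of awareness: even auditing for fairness requires keeping the protected attribute around at evaluation time. The sketch below, run on an invented toy batch, measures a per-group true-positive-rate gap (the check commonly called "equal opportunity"), a measurement that fairness through unawareness would make impossible.

```python
# A minimal sketch of a group-aware fairness audit on hypothetical data:
# per-group true-positive rate, and the gap between the best and worst group.

def tpr_by_group(records):
    """True-positive rate per group. Auditing this needs the protected
    attribute at evaluation time, which 'unawareness' would discard."""
    by_group = {}
    for r in records:
        if r["label"] == 1:  # only actual positives count toward TPR
            by_group.setdefault(r["group"], []).append(r["prediction"])
    return {g: sum(preds) / len(preds) for g, preds in by_group.items()}

batch = [
    {"group": "a", "label": 1, "prediction": 1},
    {"group": "a", "label": 1, "prediction": 1},
    {"group": "b", "label": 1, "prediction": 1},
    {"group": "b", "label": 1, "prediction": 0},
]
rates = tpr_by_group(batch)
print(rates, "gap:", max(rates.values()) - min(rates.values()))
```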

But the veil has critics. Some philosophers argue it assumes an unrealistic degree of risk aversion. Harsanyi, in the classic statement of this objection, held that not everyone behind the veil would choose maximin; a rational chooser might maximize average expected utility instead, gambling on ending up in the advantaged position and preferring a system with higher highs and lower lows.[4] Communitarian thinkers like Michael Sandel have pointed out that the veil is fundamentally individualistic.[7] It asks what a single rational person would choose for themselves, not what a community would choose together. It doesn't account for relationships, solidarity, or the ways that fairness is shaped by shared history rather than abstract principles.

The Practical Challenge

The deepest problem with applying the veil of ignorance to algorithm design is that engineers are never actually behind the veil. They know exactly who they are. They know their gender, their education, their socioeconomic position, their relationship to the technology they're building. Their biases come with them to the whiteboard.

An engineer at a workstation whose reflection in the screen shows a different person, representing the challenge of imagining yourself in another's position
Engineers are never behind the veil. Their biases come with them to the whiteboard.

The Moral Machine experiment, which gathered roughly 40 million moral decisions from people in 233 countries and territories, illustrated this vividly.[5] Participants' choices about who a self-driving car should save varied significantly across cultures, correlating with factors like a country's economic inequality and institutional strength. People weren't reasoning from behind a veil. They were reasoning from exactly where they stood.

This doesn't make the veil useless. It makes it a corrective exercise rather than a complete framework. The value of the veil of ignorance isn't that it produces the "right" answer to algorithmic fairness. It's that it forces designers to seriously inhabit positions they'd otherwise ignore. A significant share of algorithmic harm arguably stems not from malice but from a failure of imagination: builders who never genuinely considered being on the receiving end of the system they built.

The veil of ignorance won't eliminate bias from algorithmic systems. But it offers something valuable: a structured way to ask, before you ship, whether you'd accept this system's decisions if you were the person most disadvantaged by them. If the answer is no, that's worth knowing before deployment, not after.

References

[1] John Rawls, A Theory of Justice, Harvard University Press, 1971. Revised edition, 1999. https://www.hup.harvard.edu/books/9780674000780

[2] National Council on Disability, "Quality-Adjusted Life Years and the Devaluation of Life with a Disability," November 6, 2019. https://www.ncd.gov/report/quality-adjusted-life-years-and-the-devaluation-of-life-with-a-disability/

[3] Cynthia Dwork et al., "Fairness Through Awareness," Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (ITCS), 2012, pp. 214–226. https://doi.org/10.1145/2090236.2090255

[4] John Harsanyi, "Can the Maximin Principle Serve as a Basis for Morality? A Critique of John Rawls's Theory," American Political Science Review, Vol. 69, No. 2, 1975, pp. 594–606. https://doi.org/10.2307/1959090

[5] Edmond Awad et al., "The Moral Machine experiment," Nature, Vol. 563, 2018, pp. 59–64. https://doi.org/10.1038/s41586-018-0637-6

[6] Lauren Rivera, "Hiring as Cultural Matching: The Case of Elite Professional Service Firms," American Sociological Review, Vol. 77, No. 6, 2012, pp. 999–1022. https://doi.org/10.1177/0003122412463213

[7] Michael Sandel, Liberalism and the Limits of Justice, Cambridge University Press, 1982. Second edition, 1998.