Metrics as Shadows: When the Dashboard Replaces Reality
In 1975, the economist Charles Goodhart observed something that should have changed how we think about measurement forever: any observed statistical regularity tends to collapse once pressure is placed on it for control purposes.[1] The popular paraphrase is blunter: when a measure becomes a target, it ceases to be a good measure. The moment you optimize for a number, the number stops telling you what you thought it did.
This is Plato's Cave applied to data. Metrics are shadows cast by reality. They represent something real, like revenue, productivity, or user satisfaction, but they aren't the thing itself. And when we stare at the dashboard long enough, we forget we're looking at shadows.
The McNamara Fallacy
During the Vietnam War, U.S. Secretary of Defense Robert McNamara insisted on measuring progress through quantifiable metrics: enemy body counts, territory captured, sorties flown. The numbers looked good. The war was being lost.[2]
The McNamara Fallacy, as it came to be known, follows a predictable pattern: measure what's easy to measure, ignore what isn't, presume what can't be measured isn't important, and then presume what can't be measured doesn't exist. It's a progression from convenience to blindness.
McNamara's team wasn't stupid. They were sophisticated analysts who built elaborate quantitative models. But they were watching shadows on the wall. The metrics captured something real (military activity) while missing what mattered (political will, civilian sentiment, strategic coherence). The dashboard showed progress. Reality showed otherwise.
This pattern repeats constantly in technology.
When the Metric Becomes the Mission
Consider lines of code as a productivity measure. The moment developers know they're being measured by output volume, the incentive shifts from solving problems efficiently to producing more code. Concise, elegant solutions become liabilities. Verbose implementations become assets. The metric rewards the opposite of what it was meant to capture.
Agile teams face a similar distortion with story points and velocity. Velocity was designed as a planning tool, a way for teams to estimate how much work they could take on in a sprint. But when management starts tracking velocity as a performance metric, teams learn to inflate estimates. A task that was three points becomes five. Velocity goes up. Actual throughput stays the same. Everyone watches the shadow improve while the reality hasn't changed.[3]
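The velocity drift is easy to see in miniature. Here is a toy sketch of the dynamic, with made-up numbers (the sprint data below is purely illustrative, not drawn from any real team):

```python
# Hypothetical sprint data: the same eight tasks are delivered every
# sprint, but estimates inflate once velocity becomes a performance
# target. All figures are illustrative.
sprints = [
    {"tasks_done": 8, "points_done": 24},  # before: ~3 points per task
    {"tasks_done": 8, "points_done": 32},  # after: same tasks, 4-point estimates
    {"tasks_done": 8, "points_done": 40},  # later: 5-point estimates
]

for i, s in enumerate(sprints, 1):
    print(f"sprint {i}: velocity={s['points_done']} points, "
          f"throughput={s['tasks_done']} tasks")

# Velocity climbs 24 -> 32 -> 40 while throughput stays flat at 8 tasks:
# the shadow improves, the reality doesn't.
```

Tracking points delivered alongside an untargeted measure, such as tasks completed or cycle time, is what exposes the inflation.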
Social media platforms optimize for engagement metrics: likes, shares, comments, time on platform. These are shadows of something harder to measure, like whether users actually found the experience valuable. Research has consistently shown that content provoking outrage generates more engagement than content that informs or satisfies.[4] The metric (engagement) diverges from the intended outcome (user value), and the platform optimizes for the metric.
Teaching to the Test
Education offers one of the clearest examples. When school funding and teacher evaluations depend on standardized test scores, the rational response is to teach to the test. Students learn test-taking strategies instead of deep understanding. Schools narrow their curricula to tested subjects. The metric (test scores) can improve while the thing it's supposed to measure (learning) stagnates or declines.[5]
This is Plato's Cave in miniature. The test score is a shadow of learning. It captures something real, but it's a simplification. When we treat the shadow as the thing itself, we start optimizing the shadow and neglecting the reality it was meant to represent.
The same dynamic plays out in healthcare. When hospitals are penalized for high readmission rates, some respond by keeping patients longer than necessary or reclassifying readmissions. The metric improves. Patient outcomes may not.[6]
The Dashboard as Cave Wall
Modern management increasingly happens through dashboards. Executives watch real-time metrics: revenue, conversion rates, customer acquisition costs, churn rates, NPS scores. These numbers feel objective, precise, authoritative.
But every dashboard is a cave wall. Someone decided which metrics to display and which to omit. Someone chose the time range, the aggregation method, the visualization. The dashboard shows you a curated set of shadows and invites you to mistake them for the complete picture.
What the dashboard typically doesn't show: the customer who stayed but is quietly unhappy. The employee who hits every KPI but demoralizes the team. The product that meets every metric but solves the wrong problem. The market shift that hasn't shown up in the numbers yet.
GDP offers a macro-scale example. It measures economic activity but not wellbeing. A country can increase GDP through activities that make its citizens worse off: longer working hours, environmental degradation, healthcare spending driven by illness rather than health. The metric goes up. The thing it's supposed to represent doesn't necessarily follow.[7]
The Optimization Trap
The deeper problem isn't that metrics are imperfect. Of course they are. The problem is what happens when we optimize for them.
Optimization is powerful. When you point an optimization process at a target, it will find ways to hit that target that you didn't anticipate and might not want. In machine learning, this is called reward hacking or specification gaming: the system finds unexpected shortcuts to maximize the metric without achieving the intended goal.[8]
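The statistical core of Goodhart's Law can be shown in a few lines. In this toy simulation (not from the article; all distributions are assumptions), each candidate has a true value we care about and a noisy proxy we can measure. Selecting hard on the proxy selects for noise as well as substance, so the winners' measured scores systematically overstate their true value:

```python
import random

random.seed(0)

# Each candidate has a true value and a measured proxy:
# proxy = true value + measurement noise.
true_values = [random.gauss(0, 1) for _ in range(10_000)]
scored = [(v + random.gauss(0, 1), v) for v in true_values]

# Optimize hard on the proxy: keep the top 1% by measured score.
top = sorted(scored, reverse=True)[:100]

mean_proxy = sum(p for p, _ in top) / len(top)
mean_true = sum(v for _, v in top) / len(top)

print(f"mean proxy score of winners: {mean_proxy:.2f}")
print(f"mean true value of winners:  {mean_true:.2f}")

# The harder you select on the proxy, the larger the gap between the
# measured score and the underlying value.
```

The gap between `mean_proxy` and `mean_true` is the gap between the shadow and the thing casting it, and it grows with selection pressure.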
Humans do the same thing. Warehouse workers measured by items picked per hour skip safety protocols. Sales teams measured by quarterly revenue discount heavily in the last week of the quarter, borrowing from future sales. Engineers measured by tickets closed split work into smaller tickets.
Each individual response is rational. Each degrades the connection between the metric and the reality it was meant to capture. The shadow drifts further from the object casting it.
Seeing Past the Dashboard
Goodhart's Law doesn't mean measurement is useless. It means measurement requires humility.
A few principles help:
Measure multiple things. A single metric is a single shadow. Multiple metrics from different angles give you something closer to a three-dimensional picture. When metrics disagree, that's information.
Go to the source regularly. Talk to customers, not just their NPS scores. Read the code, not just the velocity chart. Walk the floor, not just the dashboard. Direct observation is how you calibrate whether your shadows still resemble reality.
Rotate your metrics. When a metric becomes a target, its usefulness decays. Changing what you measure periodically prevents the optimization trap from fully taking hold.
Measure what matters, not what's easy. The most important things, like learning, satisfaction, trust, and long-term health, are often the hardest to quantify. That doesn't mean they don't exist. It means your dashboard is showing you the easy shadows and hiding the hard ones.
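One way to operationalize "measure multiple things" is to track a targeted metric alongside an untargeted companion and flag when they diverge. A minimal sketch, with hypothetical names, data, and threshold:

```python
def divergence_alert(targeted: list[float], companion: list[float],
                     threshold: float = 0.25) -> bool:
    """Return True when the targeted metric's relative growth outpaces
    the untargeted companion metric's by more than `threshold`."""
    targeted_growth = targeted[-1] / targeted[0] - 1
    companion_growth = companion[-1] / companion[0] - 1
    return (targeted_growth - companion_growth) > threshold

# Illustrative data: velocity (targeted) doubled over four sprints
# while tasks delivered per sprint (untargeted) barely moved.
print(divergence_alert([24, 32, 40, 48], [8.0, 8.1, 8.0, 8.2]))  # True
```

A True here doesn't prove gaming; it marks the moment when the shadows disagree and direct observation is due.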
Plato's prisoners didn't know they were watching shadows because shadows were all they'd ever seen. The executive who lives in the dashboard, the teacher who teaches to the test, the platform that optimizes for engagement: they're all watching shadows and calling it reality.
The metric is useful. The dashboard is useful. Just don't mistake them for the thing they represent.
The shadows on the wall can tell you something about reality. They just can't tell you everything. And the gap between the shadow and the thing casting it is where the important stuff lives.
References
[1] Charles Goodhart, "Problems of Monetary Management: The U.K. Experience," Papers in Monetary Economics, Reserve Bank of Australia, 1975.
[2] H.R. McMaster, Dereliction of Duty: Lyndon Johnson, Robert McNamara, the Joint Chiefs of Staff, and the Lies That Led to Vietnam, HarperCollins, 1997.
[3] Ron Jeffries, "Story Points Revisited," ronjeffries.com, February 2019. https://ronjeffries.com/articles/019-01ff/story-points/Index.html
[4] William J. Brady et al., "Emotion shapes the diffusion of moralized content in social networks," Proceedings of the National Academy of Sciences, Vol. 114, No. 28, July 2017. https://doi.org/10.1073/pnas.1618923114
[5] Daniel Koretz, The Testing Charade: Pretending to Make Schools Better, University of Chicago Press, 2017.
[6] Karen E. Joynt and Ashish K. Jha, "Thirty-Day Readmissions — Truth and Consequences," New England Journal of Medicine, Vol. 366, April 2012. https://doi.org/10.1056/NEJMp1201598
[7] Joseph Stiglitz, Amartya Sen, and Jean-Paul Fitoussi, Report by the Commission on the Measurement of Economic Performance and Social Progress, 2009. https://ec.europa.eu/eurostat/documents/8131721/8131772/Stiglitz-Sen-Fitoussi-Commission-report.pdf
[8] Victoria Krakovna et al., "Specification gaming: the flip side of AI ingenuity," DeepMind Blog, April 2020. https://deepmindsafetyresearch.medium.com/specification-gaming-the-flip-side-of-ai-ingenuity-c85bdb0deeb4