The Big Rewrite: When Your Codebase Loses Its Identity

In April 2000, Joel Spolsky published what would become one of the most cited essays in software engineering: "Things You Should Never Do, Part I." His argument was simple and devastating. Netscape had decided to rewrite its browser from scratch. Version 4.0 shipped in 1997. There was never a version 5.0, and version 6.0 didn't arrive until late 2000: roughly three years without a major release while Internet Explorer devoured Netscape's market share.^[1] In Spolsky's telling, the rewrite was the single worst strategic mistake a software company can make.

Spolsky's lesson was clear: never throw away working code. But the deeper question is one he didn't ask. When Netscape 6.0 finally shipped, built entirely from new code with none of the original remaining, was it still Netscape? The name was the same. The purpose was the same. But the code—every line of it—was different. The ship had been rebuilt from scratch, and it sank.

The Strangler Fig Alternative

In 2004, Martin Fowler proposed a different approach, inspired by strangler fig trees in Australia.^[2] These trees germinate in the canopy of a host tree and slowly grow roots down the trunk. Over years, the fig surrounds and eventually replaces the host entirely. The host tree dies, but the fig stands in its place, occupying the same space, performing the same ecological role.

Fowler's strangler fig pattern applies this to software: instead of rewriting a system all at once, you build new functionality around the edges of the old system. You intercept calls, redirect traffic, and replace components one at a time. The old system shrinks as the new one grows, until eventually the legacy code can be switched off.
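To make the interception step concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen for illustration: the names `StranglerFacade`, `legacy_system`, and `new_search`, and the idea of routing by a simple string key. A real migration would more likely intercept at a proxy, gateway, or message bus, but the shape is the same: callers talk to the facade, and the facade decides which system answers.

```python
from typing import Callable, Dict, Iterable

# A handler takes a request payload and returns a response (both plain strings here).
Handler = Callable[[str], str]


def legacy_system(request: str) -> str:
    """Stand-in for the old system, which answers every route by default."""
    return f"legacy:{request}"


def new_search(request: str) -> str:
    """Stand-in for one component rebuilt on the new side."""
    return f"new:{request}"


class StranglerFacade:
    """Sends each call to a migrated handler if one exists, else to legacy."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, route: str, handler: Handler) -> None:
        """Replace one plank: from now on this route is served by new code."""
        self._migrated[route] = handler

    def handle(self, route: str, request: str) -> str:
        """Intercept the call and redirect it to whichever side owns the route."""
        handler = self._migrated.get(route, self._legacy)
        return handler(request)

    def legacy_is_idle(self, all_routes: Iterable[str]) -> bool:
        """True once every known route has a new handler, so legacy can be switched off."""
        return all(route in self._migrated for route in all_routes)
```

Callers only ever see `handle`, so calling `migrate("search", new_search)` changes who answers search requests without changing anything the caller does. When `legacy_is_idle` returns true for the full route list, the old system is no longer on any path.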

This is the Ship of Theseus enacted as engineering strategy. You replace one plank at a time, keeping the ship sailing throughout. At the end, no original code remains—but the system never stopped working, never lost its users, never broke its contracts.

The question remains: is the strangled system the same system?

Two Kinds of Rewrite

Software rewrites generally fall into two categories, and each poses a different identity problem.

The big bang rewrite is Netscape's approach: stop everything, start over, ship the new version when it's ready. The old codebase and the new codebase exist simultaneously during development, but only one is ever in production. This resembles Hobbes's extension of the puzzle, in which two ships exist at once, with one difference: Hobbes's second ship is reassembled from the discarded original planks, while the rewrite is a second ship built from new wood as the original sits in the harbor. When you launch the new one, you're asking users to accept that it's the same product.

The incremental rewrite is the strangler fig: replace pieces gradually while the system runs. This is Plutarch's original scenario—planks replaced one at a time, the ship never leaving the water. At no single point does the system "become" something new. The transformation is continuous.

Most engineers instinctively feel that the incremental approach preserves identity better. There's something about continuity—the unbroken chain of deployments, the git history stretching back to the first commit—that feels like it maintains the system's selfhood. But this is worth examining. If the end result is identical—every line of code replaced—why does the path matter?

The Git History as Narrative

Consider what git history really is. It's a record of every change, every decision, every bug fix. It's the story of how the code got to where it is. When you do an incremental rewrite, that story is continuous. Commit by commit, you can trace the evolution from old to new.

When you do a big bang rewrite, the story breaks. There's a gap—a new repository, a fresh initial commit. The narrative of the codebase starts over.

This matters more than it might seem. When a developer joins a team and runs git blame on a confusing line of code, they're reading the ship's logbook. They're asking: why is this plank here? What storm required this repair? A continuous history answers those questions. A fresh repository doesn't.

In the narrative theory of identity, the ship is its story. The git history is that story for software. Break the history, and you've arguably created a new system—even if it behaves identically.

When Twitter Changed Its Planks

Twitter's migration from Ruby on Rails to Scala and Java is one of the most instructive examples of incremental identity transformation in software.^[3] The platform launched in 2006 as a Rails monolith. By 2008, the famous "fail whale" was appearing regularly as the system buckled under load. Something had to change.

But Twitter didn't rewrite everything at once. They started by moving their message queue to Scala. Then backend services. Then the search infrastructure. Then the core timeline serving. Piece by piece, over several years, the Ruby was replaced with JVM-based services.

By 2013, virtually no Ruby remained in the critical path.^[4] The language changed. The architecture changed. The infrastructure changed. But Twitter never went offline, never relaunched, never asked users to migrate. From the outside, it was the same product the entire time.

Was it? Nearly all of the code was different. The language was different. The architecture had shifted from monolith to microservices. The only things that persisted were the API contracts, the user experience, and the name. The planks were all new. But the ship never left the water.

Slack's Philosophical Rewrite

Slack's desktop rebuild in 2019 is fascinating because the team explicitly grappled with the identity question. Their engineering blog post was titled "When a rewrite isn't: rebuilding Slack on the desktop."^[5] The title itself is a philosophical claim—they rebuilt the entire application, but they're arguing it wasn't a rewrite.

Their approach was hybrid. They rebuilt the desktop client on a modern React-based architecture, but they did it incrementally: shipping new components alongside old ones, migrating users gradually, keeping the old and new code running in the same application. According to Slack, the result launched 33 percent faster and used up to 50 percent less memory.

Slack's engineers were making an implicit argument about identity: the system's identity lives in its behavior and its contracts with users, not in its code. They replaced every component, but because the user experience remained continuous, they claimed it wasn't a "new" application. It was the same Slack, rebuilt.

This is the functional theory of identity applied to software. If it behaves the same way, serves the same purpose, and maintains the same relationships with its users, it's the same system—regardless of what's under the hood.

Semantic Versioning as Identity Contract

The software industry has actually formalized a theory of identity, though most developers don't think of it that way. Semantic versioning—the system of MAJOR.MINOR.PATCH version numbers—is an identity contract.^[6]

A patch version says: we fixed a bug without breaking compatibility, so the ship is the same. A minor version says: we added something in a backward-compatible way, so the ship is still the same. A major version says: we made an incompatible change to the public API, so the ship might be different now.
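The rule is mechanical enough to write down. The sketch below is a simplification, not a full implementation of the specification: it assumes plain MAJOR.MINOR.PATCH strings with no pre-release or build tags, and it ignores the special case of major version 0, where the specification allows anything to change. The function names `parse_version`, `classify_bump`, and `same_ship` are made up for this example.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH' into integers (no pre-release or build tags)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def classify_bump(old: str, new: str) -> str:
    """Name the most significant component that changed between two versions."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError(f"{new} is not newer than {old}")
    if new_v[0] != old_v[0]:
        return "major"
    if new_v[1] != old_v[1]:
        return "minor"
    return "patch"


def same_ship(old: str, new: str) -> bool:
    """The identity claim a version number makes: only a major bump disclaims it."""
    return classify_bump(old, new) != "major"
```

Under these assumptions, going from 1.4.2 to 1.5.0 is a minor bump and `same_ship` returns true, while going from 1.4.2 to 2.0.0 is a major bump and it returns false. The number is only a claim, of course. Whether the release actually keeps its promises is a separate question.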

The major version bump is the closest thing software has to an official identity boundary. When Python went from 2 to 3, the community spent over a decade debating whether it was the "same language." Code written for Python 2 often wouldn't run on Python 3. The syntax changed. The standard library changed. The philosophy around strings and Unicode changed fundamentally.

Yet it was still called Python. The name persisted. The community persisted. The core design philosophy persisted. Was it the same language? Python does not strictly follow semantic versioning, but its major version number carried the same message: maybe not. The name said: yes. The community was split.

Where Identity Lives in Code

The Ship of Theseus forces software engineers to ask: what is the essential thing about a system?

If identity lives in the code, then any sufficiently large refactor creates a new system. This seems wrong—nobody thinks a codebase becomes "different software" after a round of cleanup.

If identity lives in the behavior, then a perfect reimplementation in a different language is the same system. This also seems wrong. Most people would say a Python rewrite of a Java application is a different piece of software, even if it does the same thing.

If identity lives in the contracts—the APIs, the interfaces, the promises made to users and other systems—then you can replace everything behind the contract and maintain identity. Break the contract, and you've created something new, even if you only changed one line.

If identity lives in the narrative—the git history, the team, the name, the community—then continuity of story matters more than continuity of code.

In practice, software identity is probably all of these things at once, weighted differently depending on context. A library's identity lives primarily in its API. A product's identity lives primarily in its user experience. An open source project's identity lives primarily in its community. A legacy system's identity lives primarily in its behavior—the specific bugs and quirks that downstream systems depend on.
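Of the four views, the contract view is the easiest to express in code. The sketch below is illustrative only: the `KeyValueStore` contract, the three store classes, and the `honors_contract` check are all invented for this example. It shows two implementations with nothing in common internally that both keep the same promises, and a third that changes one behavior and breaks them.

```python
from typing import Dict, List, Protocol, Tuple


class KeyValueStore(Protocol):
    """The contract: get returns the latest value, or raises KeyError if absent."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> str: ...


class DictStore:
    """Original implementation, backed by a dictionary."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> str:
        return self._data[key]  # raises KeyError when the key is missing


class LogStore:
    """Full replacement: an append-only log, sharing no code with DictStore."""

    def __init__(self) -> None:
        self._log: List[Tuple[str, str]] = []

    def put(self, key: str, value: str) -> None:
        self._log.append((key, value))

    def get(self, key: str) -> str:
        for logged_key, value in reversed(self._log):  # newest entry wins
            if logged_key == key:
                return value
        raise KeyError(key)


class LenientStore(LogStore):
    """One small change: a missing key returns '' instead of raising."""

    def get(self, key: str) -> str:
        try:
            return super().get(key)
        except KeyError:
            return ""


def honors_contract(store: KeyValueStore) -> bool:
    """Check the two promises: the latest write wins, and a missing key raises KeyError."""
    store.put("a", "1")
    store.put("a", "2")
    try:
        store.get("missing")
    except KeyError:
        missing_raises = True
    else:
        missing_raises = False
    return store.get("a") == "2" and missing_raises
```

Here `honors_contract` accepts both `DictStore` and `LogStore`, even though every line behind the interface differs, and it rejects `LenientStore`, which differs from `LogStore` in a single behavior. On the contract view, the first swap preserves identity and the second does not.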

The Lesson of the Rewrite

Every software rewrite is a Ship of Theseus experiment. The question isn't whether to replace the planks—entropy guarantees you'll have to. The question is what you're trying to preserve.

Netscape tried to preserve the name while replacing everything else. It failed because the three-year gap broke continuity—the narrative stopped, and competitors filled the silence.

Twitter preserved continuity by replacing planks one at a time. It succeeded because the ship never stopped sailing, even as every plank was swapped.

Slack preserved behavior while replacing implementation. It succeeded because users never noticed the transformation—the contracts held.

The lesson isn't that rewrites are good or bad. It's that identity in software is a choice. You decide what matters—code, behavior, contracts, narrative—and you preserve that while letting everything else change.

Every git push is a plank being replaced. The question is whether you know which planks are load-bearing.

References

[1] Joel Spolsky, "Things You Should Never Do, Part I," Joel on Software, April 6, 2000. https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/

[2] Martin Fowler, "StranglerFigApplication," martinfowler.com, June 29, 2004. https://martinfowler.com/bliki/StranglerFigApplication.html

[3] Bill Venners, "Twitter on Scala," Artima, April 3, 2009. https://www.artima.com/articles/twitter-on-scala

[4] "Twitter gives up Ruby for Java," i-programmer.info, April 2013. https://i-programmer.info/news/80-java/2319-twitter-give-up-ruby-for-java.html

[5] Mark Christian and Johnny Rodgers, "When a rewrite isn't: rebuilding Slack on the desktop," Several People Are Coding, July 22, 2019. https://slack.engineering/rebuilding-slack-on-the-desktop/

[6] Tom Preston-Werner, "Semantic Versioning 2.0.0," semver.org. https://semver.org/