    5 Comments

    1. Source: Git commit history and git blame data extracted directly from the official GitHub repositories of major open-source projects, including [React](https://github.com/facebook/react), [NumPy](https://github.com/numpy/numpy), [LangChain](https://github.com/langchain-ai/langchain), [Claude Code](https://github.com/anthropics/claude-code), and [Zed](https://github.com/zed-industries/zed).

      Tools: Python (ETL data pipeline and historical git blame extraction), GitHub Actions (automated monthly delta-processing), and React with Recharts for the interactive frontend visualization.

      Context: I wanted to explore the philosophical paradox of the Ship of Theseus applied to software engineering. If every line of code in a repository is eventually rewritten, is it still the same project? This stacked area chart shows the surviving lines of code categorized by the year they were originally written. As time moves forward on the X-axis, you can see the foundational code shrinking as it gets refactored and replaced.

      You can play with the interactive version and toggle between the different case studies here: [https://asifdotexe.github.io/Theseus/](https://asifdotexe.github.io/Theseus/)

      The source code for the automated data engine is here: [https://github.com/Asifdotexe/Theseus](https://github.com/Asifdotexe/Theseus)
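
For readers wondering how "surviving lines by year written" can be measured at all, here is a minimal sketch of one way to do it with `git blame --line-porcelain`. It is not taken from the Theseus repository; the function names `count_lines_by_year` and `blame_file` are hypothetical, and bucketing by `author-time` (rather than committer time) is an assumption about what "originally written" means.

```python
import subprocess
from collections import Counter
from datetime import datetime, timezone


def count_lines_by_year(porcelain: str) -> Counter:
    """Tally blamed lines by the UTC year of their author timestamp.

    Expects the output of `git blame --line-porcelain`, which repeats the
    commit header (including `author-time <unix seconds>`) for every line.
    """
    years = Counter()
    for line in porcelain.split("\n"):
        # File content lines are prefixed with a tab, so only headers match.
        if line.startswith("author-time "):
            timestamp = int(line.split(" ", 1)[1])
            year = datetime.fromtimestamp(timestamp, tz=timezone.utc).year
            years[year] += 1
    return years


def blame_file(repo: str, path: str, rev: str = "HEAD") -> Counter:
    """Run git blame on one file at one revision (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo, "blame", "--line-porcelain", rev, "--", path],
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # tolerate files that are not valid UTF-8
        check=True,
    )
    return count_lines_by_year(result.stdout)
```

Summing `blame_file` over every tracked file at a given snapshot would give one vertical slice of a stacked area chart like this one; repeating that at monthly snapshots would give the full picture.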

    2. Why does React's codebase collapse multiple times? What happened in 2019 and 2023-2024?

    3. How does old code disappear completely and then come back later on? Was it reverted?

    4. OldSports-- on

      As a reader of philosophical texts, here are some perspectives on this topic:

      * **Mereological Essentialism**: The code base loses its identity as soon as the original code structure is altered through refactoring or the deletion of the initial commit.
      * **Spatiotemporal Continuity**: The code base remains identical as long as a continuous Git history and an uninterrupted development process exist within the same repository.
      * **Perdurantism**: The code base is understood as a four-dimensional object consisting of the sum of all its versions and developmental stages over time.
      * **Functionalism**: The code base maintains its identity through the unchanged API specification and the continued fulfillment of the defined software purpose.
      * **The Fork Dilemma**: A project reconstructed from the original source code competes with the modernized main version for the status of true identity.
