- OpenMythos is explicitly a theoretical reconstruction — it is not a leaked, fine-tuned, or distilled version of any Anthropic model.
- The project proposes Claude Mythos uses a Recurrent-Depth Transformer (RDT) design, where a fixed set of weights is applied iteratively up to 16 times per forward pass rather than through stacked unique layers.
- At 770 million parameters, OpenMythos claims representational capacity comparable to a 1.3 billion parameter conventional transformer — a claim that has not been independently benchmarked.
- Anthropic has not published any technical specification for Claude Mythos, leaving the reconstruction unverifiable against official design documentation.
What Happened
Developer Kye Gomez published OpenMythos to GitHub on April 19, 2026 — an open-source PyTorch project that proposes a specific, falsifiable architectural hypothesis for Anthropic’s Claude Mythos model. The release was covered by MarkTechPost the same day. Anthropic has not published a technical paper or architecture specification for Claude Mythos.
According to the project’s GitHub repository, OpenMythos is “a hypothesis rendered in code — and the hypothesis is specific enough to be falsifiable.” The project explicitly states it is “not a leaked model, a fine-tune, or a distillation,” positioning it as a research exercise grounded in peer-reviewed literature rather than internal access to Anthropic’s systems.
Why It Matters
Anthropic publishes safety research and model evaluations but has withheld architecture-level documentation for its Claude model family, creating a gap that independent researchers have begun filling with theoretically grounded reconstructions. OpenMythos follows a broader pattern in which open-source developers use published academic literature to approximate the design choices of frontier closed-source models.
The architectural framing the project proposes — iterative computation over a fixed set of weights — connects to an active area of academic research into compute-efficient transformers. If the RDT hypothesis is correct, it would suggest frontier model capability can be achieved through inference-time compute scaling rather than simply adding parameters at training time, a design principle with direct relevance to how research teams structure model budgets.
Technical Details
OpenMythos proposes that Claude Mythos belongs to the Recurrent-Depth Transformer (RDT) class — also referenced in academic literature as Looped Transformers — in which a fixed set of weights is applied iteratively across multiple loop steps within a single forward pass, rather than through a sequence of unique layers each with independent weights.
The architecture divides computation into three sequential stages: a Prelude of standard transformer layers run exactly once, a Recurrent Block looped up to T=16 times, and a Coda of standard transformer layers run once at the end. At each loop step t, the hidden state updates according to the rule h(t+1) = A·h(t) + B·e + Transformer(h(t), e), where e is the encoded input from the Prelude. The explicit re-injection of e at each iteration is a deliberate design choice intended to prevent representational drift across loop steps.
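The three-stage data flow can be sketched in a few lines. This is an illustrative toy, not code from the OpenMythos repository: the `toy_block` stand-in, the hidden width, and the `A`/`B` matrices are all assumptions (the actual project is a PyTorch implementation with full attention and MLP sub-blocks), but the structure — Prelude once, a single shared Recurrent Block looped T times with `e` re-injected each step, Coda once — follows the description above.

```python
import numpy as np

# Toy stand-in for a transformer sub-block: one linear map plus a
# nonlinearity. It only illustrates the data flow, not the real blocks.
def toy_block(h, e, W):
    return np.tanh((h + e) @ W)

d = 8                                  # hidden width (illustrative)
rng = np.random.default_rng(0)
W_pre, W_rec, W_coda = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))
A = np.eye(d) * 0.9                    # hypothetical state-mixing matrix
B = np.eye(d) * 0.1                    # hypothetical re-injection matrix

def forward(x, T=16):
    # Prelude: run once to encode the input.
    e = toy_block(x, np.zeros_like(x), W_pre)
    # Recurrent Block: the SAME weights (A, B, W_rec) applied T times,
    # with the encoded input e re-injected at every step to limit drift.
    h = e
    for _ in range(T):
        h = A @ h + B @ e + toy_block(h, e, W_rec)
    # Coda: run once to decode the final hidden state.
    return toy_block(h, e, W_coda)

out = forward(rng.normal(size=d))
```

Note that only three weight sets exist regardless of T; increasing the loop count buys more computation per forward pass without adding parameters, which is the core of the RDT framing.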
The project claims this design allows a 770 million parameter model to achieve representational capacity comparable to a 1.3 billion parameter standard transformer — an efficiency ratio of approximately 1.7x. No independent benchmark comparisons have been published to validate this figure. Gomez derived the parameter count and loop depth through inference from published RDT research, not from any Anthropic technical disclosure.
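The ~1.7x figure is direct arithmetic on the two parameter counts the project cites:

```python
# Ratio implied by the project's claim: a 770M-parameter looped model
# matching the representational capacity of a 1.3B standard transformer.
claimed_params = 770e6
equivalent_params = 1.3e9
ratio = equivalent_params / claimed_params
print(f"{ratio:.2f}x")   # ~1.69x, rounded to ~1.7x in the project's claim
```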
Who’s Affected
AI researchers studying parameter efficiency and inference-time compute allocation gain a new, testable architectural hypothesis: if the RDT framing is accurate, it would reframe how teams assess the performance-to-parameter tradeoff between larger static models and smaller iterative ones. Developers and organizations using Claude Mythos through the Anthropic API are not directly affected — OpenMythos does not replicate Claude Mythos weights, output behavior, or training data.
Hardware and inference infrastructure teams may find the RDT design relevant regardless of whether Anthropic actually uses it, since the architecture’s inference-time looping carries distinct implications for latency, memory bandwidth, and batch throughput compared to standard transformer deployment.
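The deployment tradeoff can be made concrete with a back-of-envelope comparison. All layer counts below are made-up assumptions for illustration (the source specifies only the loop cap T=16, not Prelude/Coda depths), but they show the direction of the shift:

```python
# Hypothetical weight-memory vs per-token-compute comparison, in abstract
# "layer-units". Layer counts here are illustrative, not OpenMythos specs.
def looped(prelude, recurrent, coda, T):
    weights = prelude + recurrent + coda       # unique layers stored once
    compute = prelude + recurrent * T + coda   # layer applications per pass
    return weights, compute

def stacked(layers):
    return layers, layers                      # weights == compute

w_loop, c_loop = looped(prelude=2, recurrent=4, coda=2, T=16)
w_stack, c_stack = stacked(layers=24)

# Looped: 8 layer-units of weights, 68 layer applications per pass.
# Stacked: 24 of each. The looped design trades memory footprint and
# bandwidth for repeated compute over a small, cache-resident weight set.
```

This is why the design matters to inference teams even if Anthropic does not use it: latency grows with T per token, while the weight working set stays small.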
What’s Next
OpenMythos is available on GitHub for community review, benchmarking, and contribution. The project's central efficiency claim — that 770 million parameters match the representational capacity of a 1.3 billion parameter transformer — awaits independent empirical validation against standard transformer baselines.
Anthropic has not commented on the project. Whether the recurrent-depth framing corresponds to the actual architecture of Claude Mythos cannot be confirmed without documentation that Anthropic has not publicly released.