Goldman, Citi, and JPMorgan Are Testing Anthropic’s Claude Mythos — Project Glasswing Exposed

Elena Volkov · Apr 12, 2026 · 5 min read

On April 10, 2026, Bloomberg reported that Goldman Sachs Group, Citigroup, and JPMorgan Chase are conducting internal evaluations of Claude Mythos, Anthropic’s most capable enterprise AI model, in connection with a confidential program called Project Glasswing. JPMorgan is the only institution formally named in the project; Goldman and Citi are identified among a broader cohort running parallel assessments. Taken together, the Wall Street testing of Claude Mythos is the most concentrated banking-sector AI evaluation since financial institutions began integrating large language models into capital markets workflows in 2023.

This isn’t exploratory curiosity. Banks at this tier don’t run enterprise AI evaluations without board-level sign-off and multimillion-dollar procurement frameworks already in place.

What Project Glasswing Actually Is

Project Glasswing is Anthropic’s closed enterprise pilot for Claude Mythos — a structured evaluation program with select financial institutions ahead of any commercial release. The codename, a reference to the glasswing butterfly’s near-transparent wings, signals the program’s design: visible to the right counterparties, invisible to everyone else.

JPMorgan is the anchor participant. According to Bloomberg’s reporting, JPMorgan has embedded Mythos into three distinct workflow categories: trading desk research synthesis, compliance document review, and internal client brief generation. Goldman Sachs and Citigroup are running independent evaluations with different internal teams — neither formally inside the Glasswing structure, but both operating with Anthropic’s direct enterprise support.

Anthropic has not publicly confirmed Project Glasswing’s existence. That’s consistent with how enterprise AI deployments at this scale operate — OpenAI’s largest enterprise deals have similarly surfaced through reporting rather than press releases.

Wall Street Claude Mythos: What the Model Actually Does

Claude Mythos represents a step-change in context handling for financial applications. Where Claude 3.7 Sonnet supported a context window of approximately 200,000 tokens, Mythos reportedly extends this to 1 million tokens, enough to hold an entire merger prospectus, trailing 10-K filings, and five years of earnings call transcripts in a single context.
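To make the difference between the two window sizes concrete, here is a rough back-of-the-envelope sketch. Only the 200,000 and 1 million token figures come from the reporting above; the page counts and the tokens-per-page ratio are illustrative assumptions, and real filings vary widely.

```python
# Illustrative estimate only. The page counts and TOKENS_PER_PAGE are
# assumptions for this sketch, not figures from the reporting.
TOKENS_PER_PAGE = 600  # assumed average for dense financial text


def estimate_tokens(pages: int, tokens_per_page: int = TOKENS_PER_PAGE) -> int:
    """Approximate token count for a document of the given page length."""
    return pages * tokens_per_page


def fits(documents: dict[str, int], window: int) -> bool:
    """True if all documents together fit inside the context window."""
    return sum(estimate_tokens(p) for p in documents.values()) <= window


# Hypothetical deal room, measured in pages.
deal_room = {
    "merger_prospectus": 300,
    "trailing_10k_filings": 450,
    "earnings_call_transcripts": 400,
}

total_tokens = sum(estimate_tokens(p) for p in deal_room.values())
fits_200k = fits(deal_room, 200_000)    # reported Claude 3.7 Sonnet window
fits_1m = fits(deal_room, 1_000_000)    # reported Mythos window

print(total_tokens, fits_200k, fits_1m)
```

Under these assumptions the hypothetical deal room comes to roughly 690,000 tokens, which overflows the smaller window but fits in the larger one. Different page counts or a different tokens-per-page ratio would shift the result.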

For Goldman’s investment banking division, which processes tens of thousands of pages per deal cycle, that added capacity matters materially. Analyst time spent on document extraction is one of the most expensive inputs in deal execution. Mythos doesn’t replace analysts. It targets the estimated 60% of review time spent on information retrieval, leaving human judgment for the decisions that actually require domain expertise.

Citigroup’s evaluation targets a different pain point: cross-border regulatory interpretation. Reconciling MiFID II, Dodd-Frank, and Basel III language across multinational deal structures is labor-intensive and error-prone. Mythos’s reported improvements in regulatory language parsing make it a credible tool for compliance teams that currently route routine interpretation work to expensive specialist counsel.

JPMorgan’s Formal Role in the Project

JPMorgan Chase CEO Jamie Dimon stated in his 2025 annual shareholder letter that the bank had integrated AI across more than 200 use cases in trading, risk management, and operations. Project Glasswing represents the next tier: moving from productivity augmentation to mission-critical workflow integration.

JPMorgan’s AI infrastructure team — which numbered over 2,000 data scientists and engineers according to the bank’s 2024 annual report — is running Mythos in a sandboxed environment with controlled data inputs. The evaluation targets three performance vectors: accuracy on financial document Q&A, latency at enterprise scale, and compliance with the bank’s internal AI governance framework, which requires models to meet specific explainability thresholds before production deployment.
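An evaluation along those three vectors could be scored with something like the sketch below. This is not JPMorgan's harness: the `EvalResult` fields, the use of a source citation as a stand-in for explainability, and every threshold value are hypothetical, chosen only to show how accuracy, latency, and a governance check might be combined into a single pass/fail gate.

```python
import math
from dataclasses import dataclass


@dataclass
class EvalResult:
    correct: bool        # answer matched the reference answer
    latency_ms: float    # end-to-end response time
    has_citation: bool   # hypothetical proxy for an explainability check


def summarize(results: list[EvalResult]) -> dict[str, float]:
    """Aggregate accuracy, nearest-rank p95 latency, and citation rate."""
    n = len(results)
    latencies = sorted(r.latency_ms for r in results)
    p95_index = math.ceil(0.95 * n) - 1  # nearest-rank percentile
    return {
        "accuracy": sum(r.correct for r in results) / n,
        "p95_latency_ms": latencies[p95_index],
        "citation_rate": sum(r.has_citation for r in results) / n,
    }


def passes(summary: dict[str, float],
           min_accuracy: float = 0.90,       # hypothetical threshold
           max_p95_ms: float = 5000.0,       # hypothetical threshold
           min_citation_rate: float = 0.99,  # hypothetical threshold
           ) -> bool:
    """Gate for production: every vector must clear its threshold."""
    return (summary["accuracy"] >= min_accuracy
            and summary["p95_latency_ms"] <= max_p95_ms
            and summary["citation_rate"] >= min_citation_rate)


sample = [
    EvalResult(True, 100.0, True),
    EvalResult(True, 200.0, True),
    EvalResult(False, 300.0, True),
    EvalResult(True, 400.0, True),
]
summary = summarize(sample)
print(summary, passes(summary))
```

The point of the gate is that all three conditions must hold at once: a model that is fast and well documented but inaccurate still fails, which matches the idea of meeting governance thresholds before production deployment.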

The bank spent approximately $17 billion on technology in 2024. A model that meaningfully compresses analyst workflows across even one major business line justifies a substantial enterprise contract. The numbers favor moving fast.

Why Now: The Competitive Pressure Behind the Timing

Morgan Stanley’s integration of OpenAI models for wealth management, launched in 2023, has since expanded into portfolio construction and client analytics. Goldman’s in-house AI code-generation tool, known internally as “Ducky,” has been live since 2023. The gap, and the pressure, is in document-heavy, high-stakes analytical work where general-purpose models previously fell short. Mythos is positioned directly at that gap.

Anthropic’s Constitutional AI framework, which trains models to adhere to a structured set of principles rather than optimizing purely for user approval, makes Mythos defensible in regulated environments. US and EU financial regulators have signaled increasing scrutiny of AI decision-making in credit evaluation and trading contexts. A model with documented safety architecture is considerably easier to defend to the OCC, FCA, or ECB than a system optimized for engagement without governance documentation.

This is also why Anthropic’s infrastructure decisions — and its track record around model architecture transparency — register with enterprise procurement teams. Financial institutions conducting technical due diligence want to understand how models are built, not just what they output on benchmarks.

The Enterprise AI Competition Wall Street Is Watching

Goldman, Citi, and JPMorgan aren’t choosing Mythos in a vacuum. The enterprise AI market for financial services is a three-way competition among Anthropic, OpenAI, and a cluster of specialized vendors including Bloomberg’s own BLP AI system and Palantir’s AIP platform.

OpenAI holds the brand recognition advantage and the largest developer ecosystem. Anthropic holds the safety narrative and, according to Bloomberg’s sources, a longer-context capability edge with Mythos. Specialized vendors hold compliance certifications and pre-built financial data integrations that general-purpose frontier models still lack.

What Project Glasswing signals is that JPMorgan has concluded frontier general-purpose models are now capable enough to compete with specialized systems on the dimensions that matter: accuracy, context length, and governance compliance. MegaOne AI tracks 139+ AI tools across 17 categories, and the pattern in enterprise deployments is consistent — the models that win at scale combine raw capability with defensible audit trails.

What This Means for Anthropic’s Valuation and Enterprise Strategy

Anthropic raised $7.3 billion in 2024 at a reported valuation of $61.5 billion. Enterprise contracts with Goldman Sachs, Citigroup, and JPMorgan — three of the five largest US banks by assets — would materially de-risk that valuation by anchoring recurring revenue from institutions with exceptionally low churn rates once systems are embedded in core workflows.

Financial services AI spending is projected to reach $97 billion annually by 2027, according to IDC’s 2025 AI spending forecast. The three banks currently evaluating Mythos collectively manage over $10 trillion in assets. Even a narrow slice of their AI procurement budget represents a transformative revenue anchor for Anthropic.

The competitive context matters here. OpenAI’s enterprise positioning has become more aggressive since its 2025 restructuring, and Meta’s open-source Llama models are creating pricing pressure at the lower end of the market. Anthropic’s play — premium model, premium price, regulated-industry credibility — requires precisely the kind of reference customer validation that Project Glasswing provides.

What Comes Next

Project Glasswing is a pilot, not a production deployment. The standard enterprise AI evaluation cycle at a major bank runs six to eighteen months from initial testing to contract signature. If JPMorgan’s evaluation began in Q1 2026, a production decision is realistic by late 2026 or early 2027 — with Goldman and Citi likely on similar timelines given parallel evaluations.
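The timeline arithmetic can be checked with a short sketch. The six-to-eighteen-month cycle and the Q1 2026 start are taken from the paragraph above; treating January and March as the earliest and latest possible start months is an assumption made for illustration.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Return the first day of the month that is `months` after d's month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


# Assumed bounds of "Q1 2026" for the start of the evaluation.
start_earliest = date(2026, 1, 1)
start_latest = date(2026, 3, 1)

# Evaluation cycle length from the article: six to eighteen months.
earliest_decision = add_months(start_earliest, 6)
latest_decision = add_months(start_latest, 18)

print(earliest_decision, latest_decision)
```

On these assumptions the full window runs from July 2026 to September 2027. The late-2026 or early-2027 estimate sits inside that range, roughly at a nine-to-twelve-month cycle.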

For the growing cohort of knowledge workers whose roles center on document-intensive analysis, the more consequential question isn’t whether these pilots succeed — it’s what banking institutions restructure once they do. Mythos, if it performs under production conditions as reported, compresses work that currently employs thousands of analysts across Wall Street’s most profitable divisions.

The banks aren’t testing Claude Mythos to understand AI. They’re testing it to understand how fast they can deploy it at scale without triggering a regulatory response. That’s a meaningfully different question, and an answer is coming.
