RESEARCH

GPT-5.2 and Claude Opus 4.6 Both Go Silent on ‘Ontologically Null’ Prompts, Study Finds

megaone_admin · Mar 22, 2026 · 2 min read
Engine Score 8/10 — Important

This research documents a previously unreported failure mode shared by frontier LLMs, including GPT-5.2 and Claude Opus 4.6. Understanding such limitations matters for AI researchers and developers working toward robust model deployment.


A preprint published on Zenodo on March 12 by researcher Rayan Pal documents an unusual behavioral convergence between two independently developed frontier language models. When prompted to “embody” ontologically null concepts — silence, nothing, void, null — both OpenAI’s GPT-5.2 and Anthropic’s Claude Opus 4.6 consistently produce empty output rather than generating text. The result held in 180 of 180 trials at temperature 0, with each model returning empty output on all 90 of its test prompts.

The paper distinguishes this behavior from standard refusal or safety filtering. When given control prompts — requests to embody concrete concepts like “a cat” or “the wind” — both models respond normally with generated text. The silence is specific to prompts asking the models to take on the identity of concepts that, by definition, have no content to express. Pal terms this a “semantic void convergence,” suggesting that the models independently arrived at a shared boundary where continuation is not possible rather than not permitted.

The experimental design tested several conditions: token-budget independence (the silence occurs regardless of how many tokens are allocated), adversarial resistance (attempts to force output through prompt engineering were largely unsuccessful), and explicit silence permission (telling the model it is “allowed” to be silent did not change the behavior, indicating the silence is not a safety-layer decision). The preprint carries DOI 10.5281/zenodo.18976656 and has been posted as an open-access document.
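The trial structure described above lends itself to a simple counting harness. The sketch below is purely illustrative: the prompt strings, the `run_trials` helper, and the caller-supplied `query` function are assumptions for demonstration, not code released with the study.

```python
# Illustrative harness mirroring the trial design described in the preprint.
# `query` is a caller-supplied stand-in for a real model API call; no actual
# API from the study is assumed here.

NULL_PROMPTS = ["Embody silence.", "Embody nothing.", "Embody the void."]
CONTROL_PROMPTS = ["Embody a cat.", "Embody the wind."]

def run_trials(model: str, prompts: list[str], query, repeats: int = 30) -> dict:
    """Count empty vs. non-empty completions at temperature 0."""
    counts = {"empty": 0, "non_empty": 0}
    for prompt in prompts:
        for _ in range(repeats):
            output = query(model, prompt, temperature=0.0)
            if not output or output.strip() == "":
                counts["empty"] += 1
            else:
                counts["non_empty"] += 1
    return counts
```

Passing `query` as a parameter keeps the harness model-agnostic, so the same loop can run against either vendor's client.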

The finding raises technical questions about what frontier models learn about semantic representation during training. If both GPT-5.2 and Claude Opus 4.6 — built by different companies, on different architectures, with different training data — converge on the same behavior for the same class of prompts, it suggests the behavior may be an emergent property of scale rather than a design choice. Researchers in the AI safety community have noted the result as evidence that large language models develop internal boundaries that are not fully explained by their training objectives or RLHF alignment.

The practical implications are narrow but theoretically significant. The study does not suggest the models are “aware” of silence in any meaningful sense, but it does demonstrate that there are prompt categories where deterministic non-generation occurs across model families. For developers building applications that rely on guaranteed output from LLM calls, the finding is a reminder that edge cases in semantic space can produce behaviors that neither documentation nor safety cards currently describe.
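For application code, the practical takeaway can be handled defensively. The sketch below is a minimal illustration, not from the study: `safe_generate`, the caller-supplied `call_llm` function, and the fallback string are all hypothetical names chosen for this example.

```python
def safe_generate(call_llm, prompt: str,
                  fallback: str = "[no output generated]") -> str:
    """Return the model's completion, substituting a fallback string when
    the model returns empty output — the deterministic non-generation case
    described above."""
    output = call_llm(prompt)
    if output is None or output.strip() == "":
        return fallback
    return output
```

Wrapping every model call this way guarantees downstream code always receives a non-empty string, even for prompts that land in a silent region of semantic space.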


MegaOne AI Editorial Team

MegaOne AI monitors 200+ sources daily to identify and score the most important AI developments. Every story is fact-checked, linked to primary sources, and rated using our six-factor Engine Score methodology.
