- OpenAI released Privacy Filter on April 22, 2026, an open-weight PII detection model available under the Apache 2.0 license on Hugging Face.
- The 1.5B-parameter model achieves a 97.43% F1 score on a corrected version of the PII-Masking-300k benchmark, with 96.79% precision and 98.08% recall.
- The model runs locally, supports up to 128,000 tokens of context, and labels all tokens in a single forward pass using a bidirectional token-classification architecture.
- OpenAI uses a fine-tuned version internally and is releasing the base model for developers to adapt and fine-tune under a permissive open-source license.
What Happened
OpenAI released Privacy Filter on April 22, 2026, an open-weight model for detecting and redacting personally identifiable information in unstructured text. The model is available immediately on Hugging Face under the Apache 2.0 license. The announcement carried no individual author byline and was attributed to OpenAI as an organization.
OpenAI stated it already deploys a fine-tuned version of Privacy Filter in its own privacy-preserving workflows, writing that the project began because “we believe that with the latest AI capabilities, we could raise the standard for privacy beyond what was already on the market.”
Why It Matters
Traditional PII detection tools rely on deterministic rules for structured formats like phone numbers and email addresses; those rules frequently miss contextually sensitive personal information in unstructured prose. Privacy Filter adds a capable, locally deployable option to a field that has historically depended on either cloud-based APIs or rule-based systems with limited recall.
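The gap is easy to demonstrate. A minimal rule-based detector (the patterns below are illustrative, not from any particular tool) catches structured identifiers but returns nothing for prose that clearly identifies a person:

```python
import re

# Deterministic rules handle structured formats well...
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\(?\d{3}\)?[-.\s]\d{3}[-.\s]\d{4}")

def rule_based_pii(text: str) -> list[str]:
    """Return substrings matched by the structured-format rules."""
    return EMAIL.findall(text) + PHONE.findall(text)

structured = "Contact: jane.doe@example.com or (555) 123-4567."
contextual = "The patient, a retired teacher living near the old mill, said..."

print(rule_based_pii(structured))  # → ['jane.doe@example.com', '(555) 123-4567']
print(rule_based_pii(contextual))  # → [] — identifying details, but no rule fires
```

A learned token classifier can flag the second sentence's contextual identifiers; regexes structurally cannot.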
The release joins existing open tools such as Microsoft’s Presidio, which combines rule-based and lightweight ML-based detectors. OpenAI’s entry distinguishes itself with a 128,000-token context window, single-pass token classification, and a formal evaluation against the PII-Masking-300k benchmark.
Technical Details
Privacy Filter is a bidirectional token-classification model with span decoding. It begins from an autoregressive pretrained checkpoint, which is then adapted into a token classifier by replacing the language modeling head with a classification head. At inference time, token-level predictions are decoded into coherent labeled spans using a constrained Viterbi procedure, enabling a single forward pass over the full input sequence. The released model has 1.5 billion total parameters with 50 million active parameters.
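To make the decoding step concrete, here is a toy sketch of constrained Viterbi decoding over BIO tags: given per-token label scores, it finds the best label sequence in which an I-X tag can only continue a span of the same type, then merges tags into spans. The label set and scoring are illustrative assumptions, not OpenAI's actual decoder:

```python
import math

# Illustrative BIO label set covering two of the announced categories.
LABELS = ["O", "B-email", "I-email", "B-person", "I-person"]

def allowed(prev: str, cur: str) -> bool:
    """BIO constraint: I-X may only follow B-X or I-X of the same type."""
    if cur.startswith("I-"):
        return prev != "O" and prev.endswith(cur[2:])
    return True

def viterbi(scores):
    """scores: one dict of log-scores per token, keyed by label.
    Returns the best label sequence respecting the BIO constraints."""
    n = len(scores)
    best = [{} for _ in range(n)]  # best path score ending in each label
    back = [{} for _ in range(n)]  # backpointers for path recovery
    for lab in LABELS:
        best[0][lab] = scores[0][lab] if allowed("O", lab) else -math.inf
    for t in range(1, n):
        for lab in LABELS:
            score, prev = max((best[t - 1][p] + scores[t][lab], p)
                              for p in LABELS if allowed(p, lab))
            best[t][lab], back[t][lab] = score, prev
    lab = max(LABELS, key=lambda l: best[n - 1][l])  # best final label
    path = [lab]
    for t in range(n - 1, 0, -1):
        lab = back[t][lab]
        path.append(lab)
    return path[::-1]

def to_spans(tokens, labels):
    """Merge B-/I- runs into (entity_type, text) spans."""
    spans, cur = [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            cur = [lab[2:], [tok]]
            spans.append(cur)
        elif lab.startswith("I-") and cur and cur[0] == lab[2:]:
            cur[1].append(tok)
        else:
            cur = None
    return [(t, " ".join(ws)) for t, ws in spans]
```

The constraint matters: even if a token's highest-scoring label taken in isolation is an I- tag of the wrong type, the decoder routes around it to produce coherent spans, which is what allows the whole input to be labeled in one forward pass plus one decoding sweep.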
On the PII-Masking-300k benchmark, the model achieves an F1 score of 96% (94.04% precision, 98.04% recall). On a corrected version of the same benchmark—adjusted for annotation issues OpenAI identified during evaluation—the F1 score rises to 97.43% (96.79% precision, 98.08% recall). The model detects eight PII categories: private_person, private_address, private_email, private_phone, private_url, private_date, account_number, and secret, with the secret category covering passwords and API keys.
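The reported operating points are internally consistent: F1 is the harmonic mean of precision and recall, which can be checked directly:

```python
def f1(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reported figures from the model card:
print(round(f1(94.04, 98.04), 2))  # original benchmark  → 96.0
print(round(f1(96.79, 98.08), 2))  # corrected benchmark → 97.43
```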
OpenAI also evaluated domain adaptation: fine-tuning on a small domain-specific dataset improved F1 from 54% to 96% on a domain-adaptation benchmark, approaching saturation with limited labeled examples. The model card additionally reports targeted evaluation on secret detection in codebases, and stress tests across multilingual, adversarial, and context-dependent examples.
Who’s Affected
Developers building training, indexing, logging, and document-review pipelines that handle sensitive text can now run a production-grade PII filter locally, without routing unfiltered data through a third-party API. The Apache 2.0 license permits commercial use and modification.
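In such a pipeline, detection is only half the job; the detected spans still have to be scrubbed before text leaves the trusted boundary. A minimal, self-contained redaction step might look like this (the `(start, end, label)` span format is an assumption for illustration, not the model's documented output):

```python
def redact(text: str, spans: list[tuple[int, int, str]]) -> str:
    """Replace detected character spans with [CATEGORY] placeholders.

    `spans` holds (start, end, label) triples — an assumed format
    for how a PII detector might report its findings."""
    out, cursor = [], 0
    for start, end, label in sorted(spans):
        out.append(text[cursor:start])     # keep text up to the span
        out.append(f"[{label.upper()}]")   # substitute the placeholder
        cursor = end                       # skip over the sensitive span
    out.append(text[cursor:])
    return "".join(out)

text = "Reach Jane Doe at jane@example.com."
spans = [(6, 14, "private_person"), (18, 34, "private_email")]
print(redact(text, spans))
# → Reach [PRIVATE_PERSON] at [PRIVATE_EMAIL].
```

Keeping the category name in the placeholder preserves downstream utility (e.g., for analytics or review queues) while removing the value itself.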
Organizations in legal, medical, and financial sectors are the stated target audience for the model’s eight-category taxonomy and long-context support, though OpenAI’s model card explicitly states that Privacy Filter “is not an anonymization tool, a compliance certification, or a substitute for policy review in high-stakes settings” and that human review remains necessary in sensitive domains. Performance may also vary across languages, naming conventions, and domains that differ from the training distribution.
What’s Next
The model is available immediately on Hugging Face alongside the published model card. OpenAI has not announced a scheduled update cadence or plans for a larger variant of the model.
Developers requiring different detection thresholds can configure the model's operating points to trade recall against precision per workflow, or fine-tune the released weights on domain-specific data under the Apache 2.0 license.
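One common way to expose such operating points is per-category confidence thresholds; a lower threshold favors recall (fewer missed spans), a higher one favors precision (fewer false redactions). The thresholds and span format below are hypothetical, since the model card's configuration interface is not described in detail:

```python
# Hypothetical per-category thresholds: secrets get a low bar because a
# missed API key is costlier than an over-redacted one.
THRESHOLDS = {"secret": 0.30, "private_email": 0.50, "default": 0.60}

def filter_spans(spans):
    """Keep spans whose confidence clears their category's threshold.

    `spans` holds (label, confidence, text) triples — an assumed
    output format for illustration."""
    return [s for s in spans
            if s[1] >= THRESHOLDS.get(s[0], THRESHOLDS["default"])]

detected = [
    ("secret", 0.35, "sk-abc123"),         # kept: clears the low secret bar
    ("private_email", 0.45, "a@b.com"),    # dropped: below 0.50
    ("private_date", 0.70, "1 May 1980"),  # kept: clears the 0.60 default
]
print(filter_spans(detected))
```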