Independent developer HauhauCS has released an uncensored variant of Alibaba’s Qwen3.5-122B-A10B model in GGUF format, designed to remove all built-in content restrictions while maintaining the original model’s capabilities. The release, posted to the LocalLLaMA community on March 22, includes new K_P quantizations that allow the 122-billion-parameter mixture-of-experts model to run on consumer hardware with reduced memory requirements.
The Qwen3.5-122B-A10B architecture uses a mixture-of-experts design with 122 billion total parameters but only 10 billion active parameters per inference pass. This makes it substantially more efficient to run than dense models of comparable quality, and the GGUF format — a binary format optimized for llama.cpp and similar local inference engines — enables deployment on machines with as little as 16GB of RAM when using aggressive quantization. The K_P quantization variants introduced in this release offer different tradeoffs between model quality and memory footprint.
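The relationship between quantization level and memory footprint can be sketched with simple arithmetic. The bits-per-weight figures below are illustrative assumptions for common quantization tiers, not the actual specifications of the K_P variants, which have not been detailed here:

```python
# Back-of-the-envelope GGUF size estimator. Bits-per-weight values are
# illustrative assumptions, not the published K_P quantization specs.

GiB = 1024 ** 3

def gguf_size_gib(total_params: float, bits_per_weight: float) -> float:
    """Approximate size of a quantized model file.

    Ignores per-tensor scales and metadata overhead, so real GGUF
    files run slightly larger than this estimate.
    """
    return total_params * bits_per_weight / 8 / GiB

TOTAL_PARAMS = 122e9  # total parameter count of Qwen3.5-122B-A10B

for label, bpw in [("8-bit", 8.0), ("4-bit", 4.0), ("2-bit", 2.0)]:
    print(f"{label}: ~{gguf_size_gib(TOTAL_PARAMS, bpw):.0f} GiB")
# 8-bit: ~114 GiB, 4-bit: ~57 GiB, 2-bit: ~28 GiB
```

Even a 2-bit file far exceeds 16GB, which suggests the low-RAM figure relies on llama.cpp's memory-mapped loading: because only 10 billion parameters are active per token, the inference engine can page expert weights in from disk on demand rather than holding the full model resident in RAM.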
Uncensored model releases have become a persistent feature of the open-source AI ecosystem. When companies like Alibaba, Meta, or Mistral release models with safety guardrails, community developers routinely produce variants with those restrictions removed — typically through fine-tuning on datasets designed to override refusal behaviors. HauhauCS describes this release as achieving “zero capability loss” relative to the original Qwen3.5-122B, meaning the uncensoring process preserved the model’s reasoning and knowledge capabilities while eliminating its tendency to decline requests.
The demand for uncensored models is driven by legitimate use cases — medical professionals querying about drug interactions, security researchers testing adversarial prompts, fiction writers generating content that triggers safety filters — as well as by users who simply prefer models without opinionated guardrails. The LocalLLaMA community, which has grown into the primary distribution channel for quantized and modified open-weight models, treats uncensored releases as a standard category alongside official model releases.
For Alibaba, community-produced variants like this one represent the double-edged nature of open-source AI. The roughly 170,000 derivative models built on Qwen include uncensored versions that Alibaba neither endorses nor controls. The dynamic mirrors what Meta faces with Llama: open weights enable innovation and adoption, but also remove the releasing company's ability to enforce usage policies after distribution.
