threlfall-hax/Qwen3.6-27B-OBLITERATED
threlfall-hax/Qwen3.6-27B-OBLITERATED is a 27-billion-parameter Qwen3.6-based model with a 32,768-token context length, modified to remove refusal behaviors. Developed by threlfall-hax, it uses the pytorch_hooks_mmd method to abliterate safety guardrails and reports 100% compliance on adversarial prompts that the base model refused. It is intended for research that requires an uncensored large language model.
Qwen3.6-27B-OBLITERATED: An Abliterated Qwen3.6 Variant
This model, developed by threlfall-hax, is a 27 billion parameter version of Qwen/Qwen3.6-27B that has undergone "abliteration" to remove its refusal behaviors and safety guardrails. The modification was performed using the pytorch_hooks_mmd method, which involves applying orthogonal projection to specific model weights (layers 32-63, o_proj and down_proj) to suppress learned refusal tendencies while preserving general capabilities.
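The orthogonal projection mentioned above can be written out as a short piece of linear algebra. This is a generic sketch of projecting a single direction out of a weight matrix's output, not the exact pytorch_hooks_mmd procedure, which this card does not spell out. The symbols are assumptions introduced for illustration: W is a weight matrix (such as o_proj or down_proj) whose output is y = Wx, r is a direction in that output space, and W' is the modified matrix.

```latex
\begin{align}
\hat{r} &= \frac{r}{\lVert r \rVert} \\
W' &= \left(I - \hat{r}\,\hat{r}^{\top}\right) W \\
\hat{r}^{\top} W' &= \hat{r}^{\top} W - \left(\hat{r}^{\top}\hat{r}\right)\hat{r}^{\top} W = 0
\end{align}
```

Under these assumptions the last line shows that every output of W' has zero component along the chosen direction, while components orthogonal to it are unchanged.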
Key Characteristics & Performance
- Uncensored Output: Achieves a 100% compliance rate on adversarial prompts that trigger refusal in the base Qwen3.6 model, indicating that refusal behavior was removed on the prompts evaluated.
- Architecture: Based on Qwen3_5ForConditionalGeneration, configured as text-only with visual encoder weights removed during abliteration.
- Context Length: Supports 32,768 tokens, extendable up to 262,144 tokens.
- Precision: Operates at bfloat16 precision.
Usage and Limitations
- Serving Requirement: For the abliteration to be preserved, the model must be served via vLLM (version >= 0.23.0) or directly with the transformers library. GGUF conversion (e.g., via llama.cpp or Ollama) does not correctly preserve the modified weights.
- vLLM Patch: A minor patch to vLLM is required to skip validation for missing visual encoder weights, as these components were removed.
- Research Use Only: This model is provided for research purposes. Users are responsible for its appropriate use, as the removal of safety guardrails means it can generate harmful content.