p-e-w/Qwen3-8B-heretic

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: Feb 11, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

p-e-w/Qwen3-8B-heretic is an 8.2 billion parameter causal language model: a decensored version of the original Qwen3-8B, created with the Heretic v1.1.0 tool. It features a 32,768-token context length, extendable to 131,072 tokens with YaRN, and supports switching between a "thinking mode" for complex reasoning (math, code) and a "non-thinking mode" for general dialogue. The model retains the Qwen3 strengths in reasoning, instruction following, agent capabilities, and multilingual support across 100+ languages, while refusing significantly fewer requests than the original.


Qwen3-8B-heretic: Decensored Qwen3 with Enhanced Control

p-e-w/Qwen3-8B-heretic is an 8.2 billion parameter causal language model: a decensored derivative of the original Qwen3-8B, produced with the Heretic v1.1.0 tool. It retains the capabilities of the Qwen3 series while exhibiting a significantly reduced refusal rate (45/100, versus 100/100 for the original model).

Key Capabilities

  • Dual-Mode Operation: Switches seamlessly between a "thinking mode" for complex logical reasoning, mathematics, and code generation and a "non-thinking mode" for efficient, general-purpose dialogue, so a single deployment can serve both analytical and conversational workloads (see the usage sketch after this list).
  • Enhanced Reasoning: Demonstrates significant improvements in mathematical problem-solving, code generation, and commonsense logical reasoning, surpassing previous Qwen models in both thinking and non-thinking modes.
  • Superior Human Preference Alignment: Excels in creative writing, role-playing, multi-turn conversations, and instruction following, providing a more natural and engaging user experience.
  • Advanced Agent Capabilities: Integrates precisely with external tools in both thinking and non-thinking modes, achieving leading performance among open-source models on complex agent-based tasks.
  • Multilingual Support: Supports over 100 languages and dialects, offering strong capabilities for multilingual instruction following and translation.
  • Extended Context Length: Natively handles up to 32,768 tokens, with validated performance up to 131,072 tokens using the YaRN method for long-text processing (a configuration sketch follows below).
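
The dual-mode behavior described above is driven by the chat template inherited from Qwen3-8B. The sketch below is a minimal illustration, assuming this derivative keeps the upstream Qwen3 template unchanged; enable_thinking is the standard Qwen3 template flag, and the generation settings are ordinary transformers usage rather than anything specific to this repository.

```python
# Minimal usage sketch (assumes the standard Qwen3 chat template is retained
# by this derivative): toggling enable_thinking switches between the two modes.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "p-e-w/Qwen3-8B-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Solve 27 * 43 step by step."}]

# Thinking mode: the template makes the model emit a <think>...</think> block
# with its reasoning before the final answer. Set enable_thinking=False for
# concise, non-thinking replies instead.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

Upstream Qwen3 also documents /think and /no_think markers inside user turns as a soft switch across multi-turn conversations; whether that behavior carries over unchanged to this decensored variant is assumed here rather than verified.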

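The extended-context figure in the last bullet comes from the upstream Qwen3-8B documentation, which enables YaRN by overriding the model's rope_scaling configuration. The values below (scaling factor 4.0 over the native 32,768-token window) are taken from that upstream card and are assumed, not verified, to apply unchanged to this derivative.

```python
# Minimal sketch: extend the usable context toward ~131,072 tokens with YaRN by
# overriding rope_scaling at load time. The values mirror the upstream Qwen3-8B
# documentation and are assumed to apply unchanged to this derivative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "p-e-w/Qwen3-8B-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    max_position_embeddings=131072,
    rope_scaling={
        "rope_type": "yarn",
        "factor": 4.0,  # 4 x the native 32,768-token window
        "original_max_position_embeddings": 32768,
    },
)
```

Because static YaRN scaling is applied at every sequence length, the upstream Qwen3 documentation advises enabling it only when long inputs are actually required.
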
Good for

  • Applications requiring a decensored model with advanced reasoning and instruction-following.
  • Scenarios benefiting from dynamic switching between detailed analytical thought and concise conversational responses.
  • Complex agent-based systems and tool-use integrations (see the tool-use sketch after this list).
  • Multilingual applications, including translation and instruction following across many languages.
  • Tasks demanding long context understanding, such as summarizing extensive documents or handling prolonged conversations.
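
For the agent and tool-use scenario mentioned above, a common pattern with Qwen3-style models is to pass tool definitions through the chat template and let the model emit a structured tool call. The sketch below uses the generic tools= argument of transformers' apply_chat_template; get_current_temperature is a hypothetical function defined only for illustration, and the <tool_call> output format is the one documented for upstream Qwen3, assumed to be retained by this derivative.

```python
# Minimal tool-use sketch (assumes the upstream Qwen3 chat template, which
# renders tool schemas into the prompt and emits <tool_call> JSON blocks).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "p-e-w/Qwen3-8B-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

def get_current_temperature(city: str) -> float:
    """Get the current temperature for a city.

    Args:
        city: Name of the city to look up.
    """
    return 21.5  # hypothetical stub; a real agent would call an actual API

messages = [{"role": "user", "content": "How warm is it in Berlin right now?"}]

# The chat template converts the function signature and docstring into a tool
# schema; the model is expected to answer with a tool-call block that the
# calling code then parses and executes.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_current_temperature],
    add_generation_prompt=True,
    tokenize=False,
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

For production agent workloads, the upstream Qwen3 card points to Qwen-Agent, which wraps this tool-call parsing; the raw-template approach above is shown only to make the mechanism visible.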