heretic-org/Qwen3-4B-Thinking-2507-heretic
Hugging Face
Text Generation · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Feb 15, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

heretic-org/Qwen3-4B-Thinking-2507-heretic is a 4.0-billion-parameter causal language model based on Qwen's Qwen3 architecture, with a native context length of 262,144 tokens. It is a decensored version of the original Qwen3-4B-Thinking-2507, which is optimized for highly complex tasks in logical reasoning, mathematics, science, and coding. The model delivers significantly improved performance on these thinking-oriented tasks and enhanced long-context understanding, making it suitable for applications that require deep analytical processing.


What is heretic-org/Qwen3-4B-Thinking-2507-heretic?

This model is a 4.0-billion-parameter causal language model derived from Qwen's Qwen3-4B-Thinking-2507 and decensored using the Heretic v1.2.0 tool. It is designed to excel at complex reasoning tasks, building on the original Qwen3 line's focus on "thinking capability."
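Heretic-style decensoring (often called "abliteration") works by identifying a direction in the model's hidden-state space associated with refusals and removing that component from activations or weights. Heretic's actual pipeline also optimizes ablation parameters to minimize both refusals and KL divergence from the original model; the NumPy fragment below is only a minimal sketch of the core projection step, assuming a single precomputed refusal direction (all names here are illustrative, not Heretic's API):

```python
import numpy as np

def ablate_direction(hidden: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove the component of each hidden state along `direction`.

    hidden:    (n_tokens, d_model) activation matrix
    direction: (d_model,) vector approximating the 'refusal direction'
    """
    direction = direction / np.linalg.norm(direction)
    # Project each row onto the direction, then subtract that component.
    coeffs = hidden @ direction  # (n_tokens,)
    return hidden - np.outer(coeffs, direction)

# Toy check: after ablation, activations are orthogonal to the direction.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))
d = rng.normal(size=8)
h_clean = ablate_direction(h, d)
```

Applied across the relevant layers, this leaves most behavior intact while suppressing the component the model would otherwise use to trigger a refusal.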

Key Capabilities & Enhancements

  • Enhanced Reasoning: Demonstrates significantly improved performance across logical reasoning, mathematics, science, and coding tasks, which typically demand human-level expertise.
  • Long Context Understanding: Supports a native context length of 262,144 tokens, crucial for processing extensive information during complex reasoning.
  • Decensored Output: Compared to the original model, this "heretic" version shows a substantial reduction in refusals (5/100 vs. 99/100), indicating far less restrictive output generation.
  • General Capabilities: Offers markedly better instruction following, tool usage, text generation, and alignment with human preferences.
  • Agentic Use: Excels in tool calling, with recommendations to use Qwen-Agent for optimal agentic performance.
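Qwen3 thinking models emit their chain of thought before the final answer, terminated by a `</think>` tag (the opening `<think>` tag may be absent from decoded output, since the chat template can inject it). A minimal parsing helper, assuming that output convention:

```python
def split_thinking(text: str, end_tag: str = "</think>") -> tuple[str, str]:
    """Split a thinking-model completion into (reasoning, final_answer)."""
    head, sep, tail = text.partition(end_tag)
    if not sep:
        # No closing tag found: treat the whole completion as the answer.
        return "", text.strip()
    return head.replace("<think>", "").strip(), tail.strip()

reasoning, answer = split_thinking(
    "<think>2 + 2 is 4.</think>\nThe answer is 4."
)
```

Stripping the reasoning this way is useful when only the final answer should be shown to users or passed to downstream tools.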

When to Use This Model

This model is strongly recommended for highly complex reasoning tasks where deep analytical processing and extensive context understanding are critical. Its decensored nature may also be beneficial for use cases requiring less restrictive content generation. It is particularly well-suited for applications in academic research, advanced problem-solving, and agent-based systems that leverage tool use.
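For local experimentation, the checkpoint can be loaded with the Hugging Face `transformers` library like any Qwen3 model. A hedged sketch (the model id comes from this card; generation parameters are illustrative, and heavy imports are kept inside the function so the code can be read without `transformers`/`torch` installed):

```python
def generate_reply(prompt: str, max_new_tokens: int = 1024) -> str:
    """Run one chat turn; returns the raw completion, which for a
    thinking model includes the reasoning before </think>."""
    # Lazy imports: only needed when actually running inference.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "heretic-org/Qwen3-4B-Thinking-2507-heretic"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

For agentic workloads, the model card recommends Qwen-Agent, which handles tool-call formatting and dispatch on top of an inference endpoint.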