yekon9/Qwen3-4B-Instruct-2507-heretic

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Apr 28, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

yekon9/Qwen3-4B-Instruct-2507-heretic is a 4.0-billion-parameter causal language model based on the Qwen3 architecture, with a native context length of 262,144 tokens. It is a decensored version of Qwen/Qwen3-4B-Instruct-2507, produced with the Heretic tool, and refuses significantly fewer prompts than the original. It demonstrates enhanced capabilities in instruction following, logical reasoning, mathematics, coding, and agentic tool usage, making it suitable for applications that require less restrictive content generation.


Model Overview

yekon9/Qwen3-4B-Instruct-2507-heretic is a 4.0-billion-parameter causal language model derived from the Qwen3-4B-Instruct-2507 base model. Its primary distinction is being a decensored variant, created with Heretic v1.2.0, which reduces content refusals from 100/100 in the original to 7/100. The model has a native context length of 262,144 tokens, though a working context of 32,768 tokens is recommended to mitigate potential out-of-memory issues.
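For orientation, here is a minimal sketch of loading the model with Hugging Face transformers and generating inside that recommended window; the dtype, device placement, prompt, and generation length below are illustrative assumptions, not settings from the model card.

```python
# Minimal loading sketch; dtype/device/prompt choices are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yekon9/Qwen3-4B-Instruct-2507-heretic"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # picks up the BF16 weights
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain beam search in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Keep prompt + output well inside the recommended 32,768-token window.
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```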

Key Capabilities

  • Decensored Output: Offers significantly fewer content refusals compared to its base model, enabling broader content generation.
  • Enhanced General Abilities: Shows improvements in instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage.
  • Long-Context Understanding: Supports a native context length of 262,144 tokens, beneficial for processing extensive inputs.
  • Agentic Functionality: Excels in tool calling; Qwen-Agent is recommended for optimal agentic performance (a tool-exposure sketch follows this list).
  • Multilingual Support: Demonstrates substantial gains in long-tail knowledge coverage across multiple languages.
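
The card recommends Qwen-Agent for agentic use; as a lighter-weight illustration, the sketch below instead uses transformers' generic chat-template tool support to expose a function schema to the model. The get_weather tool is hypothetical, not part of the model or the card.

```python
# Sketch: rendering a tool schema into the Qwen3 chat template.
# get_weather is a hypothetical stub used only for illustration.
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """
    Return the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return f"Sunny in {city}"  # stub implementation

tokenizer = AutoTokenizer.from_pretrained("yekon9/Qwen3-4B-Instruct-2507-heretic")

messages = [{"role": "user", "content": "What's the weather in Lisbon?"}]

# transformers derives a JSON schema from the type hints and docstring
# and injects it into the chat template's tool section.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)  # inspect the tool-augmented prompt the model will see
```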

Performance Highlights

This model shows strong performance across various benchmarks, often outperforming the original Qwen3-4B-Instruct-2507 in categories like reasoning (e.g., AIME25: 47.4 vs 19.1), coding (e.g., MultiPL-E: 76.8 vs 66.6), and alignment (e.g., Creative Writing v3: 83.5 vs 53.6). Its decensored nature is quantified by a Refusals metric of 7/100 compared to 100/100 for the original model.

Good For

  • Applications requiring less restrictive content generation or exploration of diverse topics.
  • Tasks benefiting from extended context windows, such as summarizing long documents or complex conversations (see the serving sketch after this list).
  • Agentic workflows and tool-use scenarios, leveraging its strong tool-calling capabilities.
  • Multilingual applications and tasks demanding strong logical reasoning and coding abilities.
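
For serving, a minimal vLLM sketch honoring the card's recommended 32,768-token cap might look like the following; the sampling settings and prompt are illustrative assumptions.

```python
# vLLM sketch capped at the recommended 32,768-token context
# (native limit is 262,144; the smaller cap mitigates OOM per the card).
from vllm import LLM, SamplingParams

llm = LLM(
    model="yekon9/Qwen3-4B-Instruct-2507-heretic",
    max_model_len=32_768,
)

params = SamplingParams(temperature=0.7, max_tokens=1024)
messages = [{"role": "user", "content": "Summarize this conversation so far."}]
outputs = llm.chat(messages, params)
print(outputs[0].outputs[0].text)
```

The same cap applies when serving over HTTP: `vllm serve` accepts `--max-model-len 32768`.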