Channyxox/Qwen3-4B-Instruct-2507-heretic
Text generation · Concurrency cost: 1 · Model size: 4B · Quant: BF16 · Ctx length: 32k · Published: Mar 13, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

Channyxox/Qwen3-4B-Instruct-2507-heretic is a 4.0-billion-parameter causal language model derived from Qwen/Qwen3-4B-Instruct-2507 and published by Channyxox. It is a decensored variant created with the Heretic tool, optimized to reduce refusals and give more open-ended responses than the original model. It retains a native context length of 262,144 tokens and performs strongly at instruction following, logical reasoning, and long-tail knowledge coverage across multiple languages.


Channyxox/Qwen3-4B-Instruct-2507-heretic: A Decensored Qwen3 Variant

This model is a decensored version of Qwen/Qwen3-4B-Instruct-2507, created by Channyxox using Heretic v1.2.0. Its primary distinction is a significantly reduced refusal rate: 16/100 refusals versus the original model's 100/100, making it better suited to open-ended and less constrained generative tasks.
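The model loads like any other Qwen3 checkpoint. Below is a minimal loading sketch using Hugging Face transformers; the repo ID comes from this card, while the dtype and device settings are illustrative assumptions rather than values from the source:

```python
# Minimal sketch: load the model with Hugging Face transformers.
# The repo ID is from this card; dtype/device choices are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Channyxox/Qwen3-4B-Instruct-2507-heretic"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)
```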

Key Capabilities & Enhancements

Based on the Qwen3-4B-Instruct-2507, this model inherits and enhances several core capabilities:

  • Instruction Following & Reasoning: Demonstrates improved performance in general instruction following, logical reasoning, text comprehension, mathematics, science, and coding.
  • Long-Tail Knowledge: Offers substantial gains in knowledge coverage across various domains and multiple languages.
  • User Alignment: Provides better alignment with user preferences for subjective and open-ended tasks, leading to more helpful and higher-quality text generation.
  • Extended Context: Supports a native context length of 262,144 tokens, enabling deep understanding and generation for very long inputs.
  • Non-Thinking Mode: Operates exclusively in non-thinking mode, simplifying usage: it never emits <think></think> blocks, so there is nothing to strip from outputs (see the generation sketch after this list).
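A minimal generation sketch illustrating the non-thinking behavior, assuming the `model` and `tokenizer` objects from the loading example above and the standard Qwen3 chat template shipped with the tokenizer; the prompt and sampling settings are illustrative assumptions:

```python
# Continues from the loading sketch above (model, tokenizer already defined).
# Prompt and max_new_tokens are illustrative assumptions.
messages = [
    {"role": "user", "content": "Give me a short introduction to large language models."}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Non-thinking mode: the model answers directly, with no <think></think>
# block to parse or remove from the decoded text.
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```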

Performance Highlights

The base Qwen3-4B-Instruct-2507 model shows strong performance across various benchmarks, often outperforming its predecessor and other models in its class. Notably, it achieves high scores in:

  • Knowledge: MMLU-Pro (69.6), MMLU-Redux (84.2), GPQA (62.0).
  • Reasoning: AIME25 (47.4), HMMT25 (31.0), ZebraLogic (80.2).
  • Coding: MultiPL-E (76.8).
  • Alignment: Creative Writing v3 (83.5), WritingBench (83.4).

Good for

  • Applications requiring less restrictive content generation and reduced refusals.
  • Tasks benefiting from extended context understanding up to 262K tokens.
  • Use cases demanding strong performance in instruction following, logical reasoning, and coding.
  • Scenarios where multilingual support and broad knowledge coverage are crucial.
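For long-document use cases, a hedged sketch of offline inference with vLLM follows; the `max_model_len` value mirrors the 262,144-token native context claimed above, and whether that fits in memory on a given GPU is an assumption, as are the sampling values:

```python
# Hedged sketch: offline long-context inference with vLLM.
# max_model_len follows the 262,144-token native context claimed above;
# fitting that window in GPU memory is an assumption about your hardware.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Channyxox/Qwen3-4B-Instruct-2507-heretic",
    max_model_len=262144,
)
params = SamplingParams(temperature=0.7, max_tokens=512)  # illustrative values

messages = [{"role": "user", "content": "Summarize the following document: ..."}]
outputs = llm.chat(messages, params)
print(outputs[0].outputs[0].text)
```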