lzdev/Qwen3-4B-Instruct-2507-heretic
Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Apr 12, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

lzdev/Qwen3-4B-Instruct-2507-heretic is a 4-billion-parameter instruction-tuned causal language model based on Qwen's Qwen3-4B-Instruct-2507, with a native context length of 262,144 tokens. This version has been decensored with the Heretic v1.2.0 tool, giving it a significantly lower refusal rate than the original model. It retains the base model's general capabilities, including instruction following, logical reasoning, and long-context understanding, making it suitable for applications that require less restrictive content generation.

Model Overview

lzdev/Qwen3-4B-Instruct-2507-heretic is a 4-billion-parameter instruction-tuned causal language model derived from Qwen's Qwen3-4B-Instruct-2507. It was processed with the Heretic v1.2.0 tool to produce a decensored variant, cutting the refusal rate from 100/100 for the original model to 22/100. The model keeps the base model's native context length of 262,144 tokens and operates in non-thinking mode, meaning it answers directly rather than generating <think></think> blocks.
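
For reference, here is a minimal loading-and-generation sketch using Hugging Face transformers. The prompt is illustrative, and the dtype/device settings are assumptions to adjust for your hardware:

```python
# Minimal sketch: load the BF16 weights and generate a reply.
# Assumes a GPU with enough memory; adjust device_map/dtype as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lzdev/Qwen3-4B-Instruct-2507-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are published in BF16
    device_map="auto",
)

# Illustrative prompt; not from the model card.
messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Non-thinking mode: the reply comes back directly, without <think></think> blocks.
output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```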

Key Capabilities

  • Decensored Output: Achieves a refusal rate of 22/100, compared to 100/100 for the original model.
  • Enhanced General Abilities: Demonstrates significant improvements in instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage.
  • Long-Context Understanding: Supports a native context length of 262,144 tokens, with enhanced capabilities in 256K long-context scenarios; see the sketch after this list.
  • Multilingual Knowledge: Offers substantial gains in long-tail knowledge coverage across multiple languages.
  • User Alignment: Provides markedly better alignment with user preferences for subjective and open-ended tasks, leading to more helpful responses.
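
To illustrate the long-context bullet above, here is a sketch that checks a prompt against the 262,144-token native window before generation. The file name and question are placeholders, not part of the model card:

```python
# Sketch: verify that a long document plus question fits the 262,144-token
# native window before sending it to the model.
from transformers import AutoTokenizer

model_id = "lzdev/Qwen3-4B-Instruct-2507-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)

NATIVE_CTX = 262_144  # native context length from the model card

with open("long_report.txt") as f:  # hypothetical long input
    document = f.read()

messages = [{
    "role": "user",
    "content": f"{document}\n\nSummarize the key findings of the document above.",
}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)
n_tokens = len(tokenizer(prompt).input_ids)
assert n_tokens <= NATIVE_CTX, f"prompt is {n_tokens} tokens, over the {NATIVE_CTX}-token window"
# From here, tokenize and call model.generate() as in the loading example above.
```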

Performance Highlights

The model shows strong performance across various benchmarks, often outperforming its base model and other comparably sized models:

  • Knowledge: Achieves 69.6 on MMLU-Pro and 62.0 on GPQA.
  • Reasoning: Scores 47.4 on AIME25 and 80.2 on ZebraLogic.
  • Coding: Reaches 35.1 on LiveCodeBench v6 and 76.8 on MultiPL-E.
  • Alignment: Scores 43.4 on Arena-Hard v2 and 83.5 on Creative Writing v3.
  • Agentic Use: Excels in tool calling capabilities, with strong scores on BFCL-v3 (61.9) and TAU1-Retail (48.7).

Use Cases

This model is particularly well-suited for applications requiring:

  • Less Restrictive Content Generation: Ideal for scenarios where the original model's censorship might be too limiting.
  • Complex Instruction Following: Benefits from improved instruction adherence and logical reasoning.
  • Long Document Analysis: Its extensive context window makes it suitable for processing and generating content based on very long texts.
  • Agentic Workflows: Strong tool-calling capabilities make it effective for integrating with external tools and systems.
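
As an illustration of the agentic use case, here is a hedged sketch of exposing a tool to the model through the chat template. get_weather is a hypothetical stub; how the tool call is formatted in the output is determined by the model's chat template:

```python
# Sketch: pass a tool schema to the model via apply_chat_template.
# transformers reads the function's type hints and docstring to build
# a JSON schema for the template.
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny, 22 C"  # stub; a real tool would query an API

model_id = "lzdev/Qwen3-4B-Instruct-2507-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [{"role": "user", "content": "What's the weather in Berlin right now?"}]
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],        # converted to a JSON schema for the template
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)
# Generate from `prompt` as in the loading example, then parse any tool-call
# block the model emits, run the matching function, and append the result as
# a "tool" message before generating again.
```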