Roman0/Qwen3-4B-Instruct-2507-heretic is a 4-billion-parameter instruction-tuned causal language model: a decensored version of Qwen/Qwen3-4B-Instruct-2507 created with Heretic v1.1.0. It retains the original's 262,144-token native context length while cutting the refusal rate from 100/100 to 4/100. The model targets general capabilities, including instruction following, logical reasoning, mathematics, coding, and agentic tool use, making it suitable for applications that require less restrictive content generation.
Overview
Roman0/Qwen3-4B-Instruct-2507-heretic is a 4-billion-parameter instruction-tuned causal language model derived from Qwen/Qwen3-4B-Instruct-2507. It was decensored with the Heretic v1.1.0 tool, lowering the refusal rate to 4/100 from the original model's 100/100, while preserving the original's 262,144-token native context length.
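Since this is a standard Hugging Face causal LM checkpoint, it can be loaded with `transformers` like the base model. The snippet below is a minimal sketch, not an official recipe; the prompt, dtype/device settings, and generation parameters are placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Roman0/Qwen3-4B-Instruct-2507-heretic"

# Load tokenizer and model; dtype/device settings here are illustrative.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Build a chat-formatted prompt using the model's own chat template.
messages = [{"role": "user", "content": "Explain KL divergence in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate a response; sampling parameters are placeholders.
output_ids = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```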
Key Capabilities
- Decensored Output: Offers less restrictive content generation compared to the base model, with a drastically reduced refusal rate.
- Enhanced General Capabilities: Shows significant improvements in instruction following, logical reasoning, text comprehension, mathematics, science, and coding.
- Long-Context Understanding: Supports a native context length of 262,144 tokens, enabling processing of very long inputs.
- Multilingual Support: Features substantial gains in long-tail knowledge coverage across multiple languages.
- Agentic Tool Usage: Strong tool-calling support; Qwen-Agent is recommended for orchestrating calls in practice (see the sketch after this list).
- Subjective Task Alignment: Markedly better alignment with user preferences in subjective and open-ended tasks, leading to more helpful responses.
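For the tool-calling item above, the model card recommends Qwen-Agent for full agentic orchestration. As a minimal alternative, recent `transformers` versions let the chat template render tool schemas directly from Python functions via the `tools` argument; the sketch below is only an illustration under that assumption, and the `get_weather` function is made up.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Roman0/Qwen3-4B-Instruct-2507-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Hypothetical tool: the chat template derives a JSON schema from the
# signature and docstring of a plain Python function.
def get_weather(city: str) -> str:
    """Return a short weather summary for a city."""
    return f"Sunny in {city}."

messages = [{"role": "user", "content": "What's the weather in Berlin?"}]

inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(inputs, max_new_tokens=256)
# The model may emit a structured tool call that your agent loop must parse,
# execute, and feed back; Qwen-Agent automates that cycle.
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```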
Performance Highlights
Compared to the original Qwen3-4B-Instruct-2507, this Heretic version shows a KL divergence of 0.1596 from the original's output distribution, indicating that the decensoring only modestly perturbs overall behavior. The base model itself performs strongly across benchmarks, often outperforming models such as GPT-4.1-nano-2025-04-14 in categories such as Knowledge (MMLU-Pro: 69.6), Reasoning (AIME25: 47.4, ZebraLogic: 80.2), Coding (MultiPL-E: 76.8), and Alignment (Creative Writing v3: 83.5).
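The KL divergence figure above comes from Heretic's own evaluation. For intuition, a rough per-token KL between the original and decensored models can be approximated as sketched below; the probe prompts, averaging, and tokenization choices are assumptions for illustration, not Heretic's exact procedure.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3-4B-Instruct-2507"
ablated_id = "Roman0/Qwen3-4B-Instruct-2507-heretic"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto").eval()
ablated = AutoModelForCausalLM.from_pretrained(ablated_id, torch_dtype="auto", device_map="auto").eval()

# Tiny made-up probe set; Heretic uses its own prompt suite.
prompts = ["Summarize the plot of Hamlet.", "Write a haiku about rain."]

kls = []
with torch.no_grad():
    for prompt in prompts:
        ids = tokenizer(prompt, return_tensors="pt").input_ids.to(base.device)
        p_logits = base(ids).logits      # reference distribution P
        q_logits = ablated(ids).logits   # modified distribution Q
        # KL(P || Q) per position: sum over the vocab, then average over tokens.
        kl_per_token = F.kl_div(
            F.log_softmax(q_logits, dim=-1),
            F.log_softmax(p_logits, dim=-1),
            log_target=True,
            reduction="none",
        ).sum(dim=-1)
        kls.append(kl_per_token.mean().item())

print(f"approx. mean per-token KL divergence: {sum(kls) / len(kls):.4f}")
```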
Good For
- Applications requiring a less restrictive or decensored language model.
- Tasks benefiting from extremely long context windows (up to 262K tokens).
- Use cases demanding strong instruction following, logical reasoning, and mathematical abilities.
- Code generation and agentic workflows leveraging tool-use capabilities.
- Generating high-quality text for subjective and open-ended tasks.