lzdev/Qwen3-4B-Instruct-2507-heretic
Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Apr 12, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

lzdev/Qwen3-4B-Instruct-2507-heretic is a 4-billion-parameter instruction-tuned causal language model based on Qwen's Qwen3-4B-Instruct-2507, with a native context length of 262,144 tokens. This version has been decensored with the Heretic v1.2.0 tool, giving it a significantly lower refusal rate than the original model. It retains the base model's general capabilities, including instruction following, logical reasoning, and long-context understanding, making it suitable for applications that require less restrictive content generation.

Model Overview

lzdev/Qwen3-4B-Instruct-2507-heretic is a 4-billion-parameter instruction-tuned causal language model derived from Qwen's Qwen3-4B-Instruct-2507. It was processed with the Heretic v1.2.0 tool to produce a decensored variant, cutting the refusal rate from 100/100 for the original model to 22/100. The model keeps the base model's native context length of 262,144 tokens and operates in non-thinking mode, meaning it answers directly rather than generating <think></think> blocks.
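
For reference, here is a minimal loading-and-generation sketch using Hugging Face transformers. The prompt is illustrative, and the dtype/device settings are assumptions to adjust for your hardware:

```python
# Minimal sketch: load the BF16 weights and generate a reply.
# Assumes a GPU with enough memory; adjust device_map/dtype as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lzdev/Qwen3-4B-Instruct-2507-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are published in BF16
    device_map="auto",
)

# Illustrative prompt; not from the model card.
messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Non-thinking mode: the reply comes back directly, without <think></think> blocks.
output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```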

Key Capabilities

  • Decensored Output: Achieves a refusal rate of 22/100, compared to 100/100 for the original model.
  • Enhanced General Abilities: Demonstrates significant improvements in instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage.
  • Long-Context Understanding: Supports a native context length of 262,144 tokens, with enhanced capabilities in 256K long-context scenarios; see the sketch after this list.
  • Multilingual Knowledge: Offers substantial gains in long-tail knowledge coverage across multiple languages.
  • User Alignment: Provides markedly better alignment with user preferences for subjective and open-ended tasks, leading to more helpful responses.
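
To illustrate the long-context bullet above, here is a sketch that checks a prompt against the 262,144-token native window before generation. The file name and question are placeholders, not part of the model card:

```python
# Sketch: verify that a long document plus question fits the 262,144-token
# native window before sending it to the model.
from transformers import AutoTokenizer

model_id = "lzdev/Qwen3-4B-Instruct-2507-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)

NATIVE_CTX = 262_144  # native context length from the model card

with open("long_report.txt") as f:  # hypothetical long input
    document = f.read()

messages = [{
    "role": "user",
    "content": f"{document}\n\nSummarize the key findings of the document above.",
}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)
n_tokens = len(tokenizer(prompt).input_ids)
assert n_tokens <= NATIVE_CTX, f"prompt is {n_tokens} tokens, over the {NATIVE_CTX}-token window"
# From here, tokenize and call model.generate() as in the loading example above.
```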

Performance Highlights

The model shows strong performance across various benchmarks, often outperforming its base model and other comparably sized models:

  • Knowledge: Achieves 69.6 on MMLU-Pro and 62.0 on GPQA.
  • Reasoning: Scores 47.4 on AIME25 and 80.2 on ZebraLogic.
  • Coding: Reaches 35.1 on LiveCodeBench v6 and 76.8 on MultiPL-E.
  • Alignment: Scores 43.4 on Arena-Hard v2 and 83.5 on Creative Writing v3.
  • Agentic Use: Excels in tool calling capabilities, with strong scores on BFCL-v3 (61.9) and TAU1-Retail (48.7).

Use Cases

This model is particularly well-suited for applications requiring:

  • Less Restrictive Content Generation: Ideal for scenarios where the original model's censorship might be too limiting.
  • Complex Instruction Following: Benefits from improved instruction adherence and logical reasoning.
  • Long Document Analysis: Its extensive context window makes it suitable for processing and generating content based on very long texts.
  • Agentic Workflows: Strong tool-calling capabilities make it effective for integrating with external tools and systems.
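
As an illustration of the agentic use case, here is a hedged sketch of exposing a tool to the model through the chat template. get_weather is a hypothetical stub; how the tool call is formatted in the output is determined by the model's chat template:

```python
# Sketch: pass a tool schema to the model via apply_chat_template.
# transformers reads the function's type hints and docstring to build
# a JSON schema for the template.
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny, 22 C"  # stub; a real tool would query an API

model_id = "lzdev/Qwen3-4B-Instruct-2507-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [{"role": "user", "content": "What's the weather in Berlin right now?"}]
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],        # converted to a JSON schema for the template
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)
# Generate from `prompt` as in the loading example, then parse any tool-call
# block the model emits, run the matching function, and append the result as
# a "tool" message before generating again.
```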