This is a 4-billion-parameter instruction-tuned causal language model: a decensored version of Qwen/Qwen3-4B-Instruct-2507 created using Heretic v1.2.0. The base model, developed by Qwen, features a native context length of 262,144 tokens and significant improvements in general capabilities, including instruction following, logical reasoning, mathematics, coding, and tool usage. The model is optimized for enhanced alignment with user preferences in subjective and open-ended tasks, making it suitable for generating helpful, high-quality text.
Model Overview
This model, p-e-w/Qwen3-4B-Instruct-2507-heretic-REPRODUCTION-TEST-2, is a decensored variant of the original Qwen/Qwen3-4B-Instruct-2507, processed with Heretic v1.2.0. It is a 4-billion-parameter causal language model from the Qwen3 family, featuring a native context length of 262,144 tokens.
Key Capabilities & Enhancements
- Decensored Output: Compared to the original, this version shows a dramatic reduction in refusals (14/100 vs. 100/100 on the refusal test set), indicating substantially less restrictive response behavior.
- General Performance: Offers substantial improvements across various domains including instruction following, logical reasoning, text comprehension, mathematics, science, and coding.
- Long-Context Understanding: Excels in processing and understanding very long inputs, supporting up to 256K tokens.
- User Alignment: Demonstrates better alignment with user preferences for subjective and open-ended tasks, leading to more helpful and higher-quality text generation.
- Tool Usage: Features enhanced tool-calling capabilities, with Qwen-Agent recommended for optimal integration (see the sketch below).
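
As a minimal sketch of tool calling via Qwen-Agent, assuming the model is already served behind an OpenAI-compatible endpoint (the server URL, port, and served model name below are placeholders, not values from this card):

```python
from qwen_agent.agents import Assistant

# LLM configuration: assumes a local OpenAI-compatible server
# (e.g. started with vllm or sglang); URL and name are placeholders.
llm_cfg = {
    "model": "Qwen3-4B-Instruct-2507-heretic-REPRODUCTION-TEST-2",
    "model_server": "http://localhost:8000/v1",
    "api_key": "EMPTY",
}

# Give the agent a built-in tool; Qwen-Agent ships a code interpreter.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "Plot y = x^2 for x in [0, 10]."}]

# bot.run streams intermediate responses; keep only the final one.
for responses in bot.run(messages=messages):
    pass
print(responses)
```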
Performance Highlights
This model shows strong performance across several benchmarks, often outperforming its base model and even larger models in specific categories:
- Knowledge: Achieves 69.6 on MMLU-Pro and 84.2 on MMLU-Redux.
- Reasoning: Scores 47.4 on AIME25 and 80.2 on ZebraLogic.
- Coding: Reaches 35.1 on LiveCodeBench v6.
- Alignment: Scores 43.4 on Arena-Hard v2 and 83.5 on Creative Writing v3.
Usage Considerations
- The model operates in "non-thinking mode" and does not generate `<think></think>` blocks.
- Recommended sampling parameters are `Temperature=0.7`, `TopP=0.8`, `TopK=20`, and `MinP=0` for optimal performance.
- Supports deployment with `sglang` and `vllm` for OpenAI-compatible API endpoints, and is compatible with local applications like Ollama and llama.cpp.
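
Below is a minimal sketch of local inference with the Hugging Face transformers API, using the recommended sampling parameters from this card (the prompt and `max_new_tokens` value are illustrative; everything else is standard transformers boilerplate):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "p-e-w/Qwen3-4B-Instruct-2507-heretic-REPRODUCTION-TEST-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Give me a short introduction to large language models."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling parameters recommended in this model card.
outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    min_p=0.0,
)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For serving, an OpenAI-compatible endpoint can be started with, for example, `vllm serve p-e-w/Qwen3-4B-Instruct-2507-heretic-REPRODUCTION-TEST-2` (assuming a recent vLLM installation); sglang offers an equivalent serving mode.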