Name: VINAY-UMRETHE/Qwen3-0.6B-heretic-Base2 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: VINAY-UMRETHE

VINAY-UMRETHE/Qwen3-0.6B-heretic-Base2 Overview

This model is a decensored version of the Qwen/Qwen3-0.6B, developed using the Heretic v1.3.0 tool. It retains the core architecture of the Qwen3 series, featuring 0.6 billion parameters (0.44B non-embedding) and a context length of 32,768 tokens.

Key Differentiators & Capabilities

Decensored Output: Significantly reduces refusals, with only 6 refusals out of 100 compared to 55/100 in the original Qwen/Qwen3-0.6B, making it suitable for use cases requiring less content filtering.
Dynamic Thinking Modes: Uniquely supports seamless switching between a 'thinking mode' for complex logical reasoning, mathematics, and code generation, and a 'non-thinking mode' for efficient, general-purpose dialogue. This can be controlled via enable_thinking parameter or dynamic /think and /no_think tags in user prompts.
Enhanced Reasoning: In thinking mode, it shows significant improvements in mathematical, code generation, and commonsense logical reasoning tasks.
Agent Capabilities: Excels in tool-calling and integration with external tools, performing well in complex agent-based tasks, especially when used with Qwen-Agent.
Multilingual Support: Supports over 100 languages and dialects with strong multilingual instruction following and translation capabilities.

Performance & Best Practices

While maintaining a low KL divergence of 0.0139 from the original model, its primary distinction is the reduced refusal rate. For optimal performance, specific sampling parameters are recommended for each mode: Temperature=0.6, TopP=0.95, TopK=20 for thinking mode, and Temperature=0.7, TopP=0.8, TopK=20 for non-thinking mode. It is advised to use an adequate output length of 32,768 tokens for most queries.

Overview

VINAY-UMRETHE/Qwen3-0.6B-heretic-Base2 Overview

Key Differentiators & Capabilities

Performance & Best Practices

Full Model Card (README)