VINAY-UMRETHE/Qwen3-0.6B-heretic-REPRODUCE
VINAY-UMRETHE/Qwen3-0.6B-heretic-REPRODUCE is a 0.6-billion-parameter causal language model, based on the Qwen3 architecture, that has been decensored with the Heretic v1.2.0 tool. The model has a 32,768-token context length and supports seamless switching between a 'thinking mode' for complex reasoning and a 'non-thinking mode' for efficient general dialogue. It is optimized for reasoning, instruction following, and agent use, and refuses far fewer prompts than the original model.
VINAY-UMRETHE/Qwen3-0.6B-heretic-REPRODUCE Overview
This model is a decensored version of the Qwen3-0.6B causal language model, processed with Heretic v1.2.0. It retains the core Qwen3 architecture, with 0.6 billion parameters and a 32,768-token context length. Its key differentiator is a significantly reduced refusal rate, achieved through specific abliteration parameters: 6/100 refusals versus 55/100 for the original Qwen3-0.6B.
Key Capabilities & Features
- Dual-Mode Operation: Seamlessly switches between a 'thinking mode' for complex logical reasoning, math, and coding, and a 'non-thinking mode' for efficient, general-purpose dialogue.
- Enhanced Reasoning: Demonstrates improved performance in mathematics, code generation, and commonsense logical reasoning, surpassing previous Qwen models.
- Superior Human Preference Alignment: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following.
- Agent Capabilities: Supports precise integration with external tools in both thinking and non-thinking modes, achieving leading performance in complex agent-based tasks.
- Multilingual Support: Capable of handling over 100 languages and dialects with strong multilingual instruction following and translation abilities.
Usage & Best Practices
Users can toggle thinking mode via the enable_thinking argument to tokenizer.apply_chat_template, or dynamically within a conversation using the /think and /no_think tags in prompts. Each mode has recommended sampling parameters; using them avoids performance degradation and repetition. The model also supports agentic use, for example tool calling via Qwen-Agent.
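As a minimal sketch of the usage described above, assuming the standard transformers chat-template API; the sampling values mirror the upstream Qwen3 recommendations and should be treated as assumptions for this checkpoint:

```python
# Sketch: toggling thinking mode and choosing per-mode sampling parameters.
# The sampling values below follow the upstream Qwen3 recommendations and
# are assumptions here, not values measured for this checkpoint.

def sampling_params(thinking: bool) -> dict:
    """Per-mode sampling settings to limit degradation and repetition."""
    if thinking:
        return {"temperature": 0.6, "top_p": 0.95, "top_k": 20}
    return {"temperature": 0.7, "top_p": 0.8, "top_k": 20}


def soft_switch(prompt: str, thinking: bool) -> str:
    """Per-turn mode override using the /think and /no_think tags."""
    return f"{prompt} {'/think' if thinking else '/no_think'}"


if __name__ == "__main__":
    # Full generation loop; requires `transformers` and a model download.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "VINAY-UMRETHE/Qwen3-0.6B-heretic-REPRODUCE"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    messages = [
        {"role": "user", "content": soft_switch("Explain binary search.", thinking=True)}
    ]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=True,  # set False for non-thinking mode
    )
    inputs = tokenizer(text, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=256, **sampling_params(thinking=True))
    print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

The helpers keep the mode decision in one place, so the same code path serves both thinking and non-thinking generation.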