VINAY-UMRETHE/Qwen3-0.6B-heretic-OG is a 0.8 billion parameter causal language model, based on the Qwen3 architecture, with a 32,768 token context length. This model is a decensored version of Qwen/Qwen3-0.6B, created using the Heretic v1.2.0 tool, significantly reducing refusals compared to the original. It retains Qwen3's unique ability to seamlessly switch between 'thinking' and 'non-thinking' modes for complex reasoning or efficient general dialogue. This model is optimized for enhanced reasoning, instruction-following, and agent capabilities, making it suitable for applications requiring less restrictive content generation.
VINAY-UMRETHE/Qwen3-0.6B-heretic-OG: Decensored Qwen3 Model
This model is a decensored version of Qwen/Qwen3-0.6B, created using the Heretic v1.2.0 tool. It significantly reduces content refusals, producing 31 refusals out of 100 compared to the original model's 59 out of 100. Based on the Qwen3 architecture, this 0.8 billion parameter causal language model features a 32,768 token context length.
Key Capabilities & Features
- Decensored Output: Offers less restrictive content generation compared to the base Qwen3 model.
- Dual Thinking Modes: Uniquely supports seamless switching between a 'thinking mode' for complex logical reasoning, math, and coding, and a 'non-thinking mode' for efficient, general-purpose dialogue. This can be controlled via the `enable_thinking` parameter or the `/think` and `/no_think` tags in prompts.
- Enhanced Reasoning: Retains Qwen3's advancements in mathematics, code generation, and commonsense logical reasoning.
- Superior Human Preference Alignment: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following.
- Agent Capabilities: Demonstrates strong tool-calling abilities, integrating with external tools in both thinking and non-thinking modes.
- Multilingual Support: Supports over 100 languages and dialects with robust multilingual instruction following and translation.
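The thinking-mode switch and its output format can be sketched in plain Python. The `/think` and `/no_think` tags and the `<think>...</think>` wrapper follow the upstream Qwen3 model card; the helper functions below are illustrative, not part of any library.

```python
def tag_prompt(user_text: str, thinking: bool) -> str:
    """Append the soft-switch tag that requests (or suppresses) thinking mode."""
    return f"{user_text} {'/think' if thinking else '/no_think'}"

def split_thinking(completion: str) -> tuple[str, str]:
    """Separate the reasoning trace from the final answer.

    Qwen3 wraps its chain of thought in <think>...</think>; everything
    after the closing tag is the user-facing reply.
    """
    open_tag, close_tag = "<think>", "</think>"
    start = completion.find(open_tag)
    end = completion.find(close_tag)
    if start == -1 or end == -1:
        return "", completion.strip()  # no reasoning block present
    reasoning = completion[start + len(open_tag):end].strip()
    answer = completion[end + len(close_tag):].strip()
    return reasoning, answer

# Example with a mock completion string:
reasoning, answer = split_thinking("<think>2+2 is 4.</think>\nThe answer is 4.")
print(reasoning)  # 2+2 is 4.
print(answer)     # The answer is 4.
```

In a real pipeline, `tag_prompt` would be applied to the user turn before templating, and `split_thinking` to the decoded model output.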
Usage Recommendations
- Sampling Parameters: Specific `Temperature`, `TopP`, `TopK`, and `MinP` settings are recommended for optimal performance in both thinking and non-thinking modes. Greedy decoding is discouraged for thinking mode.
- Output Length: An output length of 32,768 tokens is recommended for most queries, up to 38,912 tokens for complex problems.
- Standardized Output: Use specific prompts for math problems (e.g., "Please reason step by step, and put your final answer within \boxed{}") and multiple-choice questions (e.g., JSON structure for answers).
- Agentic Use: Recommended with Qwen-Agent for best tool-calling performance.
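The sampling recommendations above can be captured as per-mode generation presets. A minimal sketch follows; the specific values mirror those published on the upstream Qwen3 model card and should be treated as starting points, not hard requirements.

```python
# Per-mode generation presets, assuming the values recommended on the
# upstream Qwen3 model card (an assumption inherited from that card,
# not stated in this one).
SAMPLING_PRESETS = {
    # Thinking mode: do_sample=True because greedy decoding is discouraged
    # (it can cause repetition in long reasoning traces).
    "thinking": {
        "do_sample": True,
        "temperature": 0.6,
        "top_p": 0.95,
        "top_k": 20,
        "min_p": 0.0,
        "max_new_tokens": 32768,  # raise toward 38912 for complex problems
    },
    # Non-thinking mode: slightly warmer temperature, tighter nucleus.
    "non_thinking": {
        "do_sample": True,
        "temperature": 0.7,
        "top_p": 0.8,
        "top_k": 20,
        "min_p": 0.0,
        "max_new_tokens": 32768,
    },
}

def generation_kwargs(thinking: bool) -> dict:
    """Return a copy of the preset, e.g. to pass to `model.generate(**kwargs)`."""
    return dict(SAMPLING_PRESETS["thinking" if thinking else "non_thinking"])

print(generation_kwargs(True)["temperature"])   # 0.6
print(generation_kwargs(False)["top_p"])        # 0.8
```

Returning a copy keeps the shared presets immutable if a caller tweaks individual fields per request.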