cmhacks/Qwen3-0.6B-hereticed
Text Generation · Concurrency Cost: 1 · Model Size: 0.8B · Quant: BF16 · Context Length: 32k · Published: Feb 16, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

cmhacks/Qwen3-0.6B-hereticed is a 0.8 billion parameter causal language model, a decensored version of Qwen/Qwen3-0.6B created using Heretic v1.2.0. It features a 32,768 token context length and is specifically modified to reduce refusals compared to its original counterpart. This model is designed for general-purpose dialogue and instruction following, with enhanced flexibility in content generation due to its decensored nature.


Model Overview

cmhacks/Qwen3-0.6B-hereticed is a 0.8 billion parameter causal language model derived from the Qwen3-0.6B base model. This version has been decensored using Heretic v1.2.0, cutting refusals from 53/100 prompts for the original model to 3/100, while maintaining a low KL divergence of 0.0034 from the original model's outputs. It supports a context length of 32,768 tokens.
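To make the KL-divergence figure concrete, here is a minimal sketch of the kind of comparison it reports: the divergence between the original and modified models' next-token distributions. This is illustrative only; Heretic's actual evaluation protocol (prompt set, positions averaged, direction of the divergence) may differ.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

# Compare next-token distributions of the original and decensored models.
# NOTE: illustrative sketch only; Heretic's actual KL evaluation may differ.
orig = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B", torch_dtype=torch.bfloat16)
mod = AutoModelForCausalLM.from_pretrained("cmhacks/Qwen3-0.6B-hereticed", torch_dtype=torch.bfloat16)
tok = AutoTokenizer.from_pretrained("cmhacks/Qwen3-0.6B-hereticed")

ids = tok("Explain how rainbows form.", return_tensors="pt").input_ids

with torch.no_grad():
    log_p = F.log_softmax(orig(ids).logits[0, -1].float(), dim=-1)  # original
    log_q = F.log_softmax(mod(ids).logits[0, -1].float(), dim=-1)   # decensored

# KL(P || Q) over the vocabulary at the final position.
kl = F.kl_div(log_q, log_p, log_target=True, reduction="sum")
print(f"KL divergence at last position: {kl.item():.4f}")
```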

Key Capabilities & Features

  • Decensored Content Generation: Modified to produce responses with fewer refusals, offering greater flexibility in output.
  • Dual Thinking Modes: Inherits Qwen3's ability to switch seamlessly between a 'thinking mode' for complex logical reasoning, math, and coding, and a 'non-thinking mode' for efficient general-purpose dialogue. The mode is controlled via the enable_thinking argument of the chat template or via soft switches (/think, /no_think) in prompts; see the sketch after this list.
  • Enhanced Reasoning: The base Qwen3 model shows significant improvements in mathematics, code generation, and commonsense logical reasoning.
  • Superior Human Alignment: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following.
  • Agentic Capabilities: Demonstrates strong tool-calling abilities, integrating with external tools in both thinking and non-thinking modes.
  • Multilingual Support: Supports over 100 languages and dialects for instruction following and translation.
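
As a concrete illustration of the mode switch, the sketch below uses Hugging Face transformers and assumes this variant retains the base Qwen3 chat template, whose apply_chat_template accepts the enable_thinking argument:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "cmhacks/Qwen3-0.6B-hereticed"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Solve: if 3x + 7 = 22, what is x?"}]
# enable_thinking=True emits a reasoning block before the answer;
# set it to False (or prepend /no_think to the prompt) for direct replies.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=4096)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```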

Use Cases

This model is particularly well-suited for applications requiring:

  • Flexible Content Generation: Where the original model's refusal rates might be restrictive.
  • Complex Problem Solving: Leveraging its thinking mode for tasks involving logical reasoning, mathematics, and code generation.
  • Engaging Conversational AI: For creative writing, role-playing, and multi-turn dialogues.
  • Agent-based Systems: Utilizing its tool-calling capabilities for integration with external functions.
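
For agentic use, the upstream Qwen3 documentation recommends Qwen-Agent, which handles tool-call parsing. A hedged sketch follows: the local server endpoint is a placeholder, and it assumes the model is already served via an OpenAI-compatible API (e.g. vLLM or SGLang).

```python
from qwen_agent.agents import Assistant

# Placeholder endpoint: assumes an OpenAI-compatible server (e.g. vLLM)
# is already serving this model locally.
llm_cfg = {
    "model": "cmhacks/Qwen3-0.6B-hereticed",
    "model_server": "http://localhost:8000/v1",
    "api_key": "EMPTY",
}

# 'code_interpreter' is one of Qwen-Agent's built-in tools.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "Use code to compute the 20th Fibonacci number."}]
for responses in bot.run(messages=messages):  # streams incremental responses
    pass
print(responses)  # final message list, including tool calls and results
```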

For optimal performance, the recommended sampling parameters differ between thinking and non-thinking modes (see the sketch below), and an output length of up to 32,768 tokens is suggested for most queries.
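
The upstream Qwen3 model card suggests different sampling presets per mode: temperature 0.6, top-p 0.95, top-k 20 for thinking, and temperature 0.7, top-p 0.8, top-k 20 for non-thinking. The sketch below continues the transformers example above; whether these presets remain optimal after decensoring has not been verified here.

```python
# Sampling presets from the upstream Qwen3 model card; untested for this
# decensored variant. Reuses `model` and `inputs` from the loading sketch.
THINKING = dict(temperature=0.6, top_p=0.95, top_k=20, min_p=0.0)
NON_THINKING = dict(temperature=0.7, top_p=0.8, top_k=20, min_p=0.0)

output_ids = model.generate(
    **inputs,
    max_new_tokens=32768,  # generous budget, per the output-length guidance
    do_sample=True,
    **THINKING,  # swap in NON_THINKING when enable_thinking=False
)
```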