tvall43/Qwen3.5-4B-heretic-v2
tvall43/Qwen3.5-4B-heretic-v2 is a 4.5-billion-parameter decensored version of the Qwen3.5-4B model with a 32,768-token context length. It was modified with Heretic v1.2.0 to significantly reduce refusals while keeping KL divergence from the original low. It targets applications that need less restrictive content generation and builds on the multimodal, efficient architecture of the Qwen3.5 series.
What is tvall43/Qwen3.5-4B-heretic-v2?
This model is a 4.5-billion-parameter, decensored variant of Qwen3.5-4B, created using Heretic v1.2.0. It retains the multimodal capabilities of the original Qwen3.5 series, including a unified vision-language foundation, an efficient hybrid architecture, and scalable RL generalization. Its key differentiator is a much lower refusal rate (4/100 compared to 99/100 for the original) at a KL divergence of only 0.0483 from the original, which makes it suitable for use cases where less restrictive content generation is desired.
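To make the two reported metrics concrete, here is a minimal sketch of how a KL divergence between two next-token distributions and a refusal rate can be computed. The logits are made-up illustrative values, and the direction KL(original || modified) and the use of a single distribution are assumptions; this is not Heretic's actual evaluation code.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def refusal_rate(refusals, prompts):
    """Fraction of prompts that were refused."""
    return refusals / prompts

# Hypothetical next-token logits over a 3-token vocabulary (illustrative only).
original_logits = [2.0, 1.0, 0.1]
modified_logits = [1.8, 1.1, 0.2]

p = softmax(original_logits)   # original model's distribution
q = softmax(modified_logits)   # modified model's distribution
kl = kl_divergence(p, q)

print(f"KL(original || modified) = {kl:.4f} nats")
print(f"Refusal rate, modified: {refusal_rate(4, 100):.0%}")
print(f"Refusal rate, original: {refusal_rate(99, 100):.0%}")
```

A KL divergence of 0 means the two distributions are identical, so a small value such as the reported 0.0483 indicates that the modified model's predictions stay close to the original's on the prompts used for measurement.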
Key Capabilities
- Decensored Output: Refuses far less often than the base model (4/100 refusals compared to 99/100).
- Multimodal Understanding: Supports unified vision-language processing, excelling in reasoning, coding, agent tasks, and visual understanding.
- Efficient Architecture: Utilizes Gated Delta Networks and sparse Mixture-of-Experts for high-throughput inference.
- Extensive Context Length: Natively supports up to 262,144 tokens, extensible to over 1 million tokens with YaRN scaling.
- Multilingual Support: Expanded coverage to 201 languages and dialects.
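As a rough illustration of the context-extension figure above, the sketch below works out the scaling factor that relates a target context length to the native 262,144-token window. It assumes the YaRN factor is simply the ratio of extended to native context; the helper names are hypothetical, and no actual configuration keys for any inference framework are shown.

```python
NATIVE_CONTEXT = 262_144  # native context length stated in the capability list

def yarn_factor(target_tokens, native_tokens=NATIVE_CONTEXT):
    """Scaling factor needed to reach target_tokens (assumed: target / native)."""
    return max(1.0, target_tokens / native_tokens)

def extended_context(factor, native_tokens=NATIVE_CONTEXT):
    """Context length reached when the native window is scaled by factor."""
    return int(native_tokens * factor)

# A factor of 4.0 takes the native window past one million tokens.
print(extended_context(4.0))       # 1048576
print(yarn_factor(1_048_576))      # 4.0
```

Under this assumption, a factor of 4.0 is enough to exceed one million tokens, while targets at or below the native window need no scaling.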
Should I use this for my use case?
- Good for: Applications that need less restrictive, 'decensored' output, particularly in multimodal contexts, and tasks where the original Qwen3.5-4B refuses too often. The Qwen3.5 series' strong results on language, vision, and agent benchmarks, combined with the reduced refusal rate, make it suitable for complex reasoning, code generation, and general conversational AI where content filtering is handled externally or is less critical.
- Consider alternatives if: Your application must adhere to strict content moderation policies, or if any deviation from the original Qwen3.5-4B's output distribution, even the small reported KL divergence of 0.0483, is a critical concern for your task.