ansulev/Qwen3.5-4B-Claude-4.6-OS-Auto-Variable-HERETIC-UNCENSORED-THINKING
The ansulev/Qwen3.5-4B-Claude-4.6-OS-Auto-Variable-HERETIC-UNCENSORED-THINKING model is a 4.5 billion parameter Qwen 3.5 dense model, fine-tuned by ansulev using four Claude-4.6-OS datasets. This multimodal model supports vision (images and video) and text inputs, and is specifically designed to be uncensored and highly compliant with user instructions. It demonstrates improved reasoning and output generation, surpassing the base Qwen3.5-4B-Instruct model in various benchmarks, and features enhanced tool handling capabilities.
Loading preview...
What the fuck is this model about?
This model, ansulev/Qwen3.5-4B-Claude-4.6-OS-Auto-Variable-HERETIC-UNCENSORED-THINKING, is a 4.5 billion parameter variant of the Qwen 3.5 base model, fine-tuned by ansulev. It leverages four Claude-4.6-OS datasets to enhance its reasoning and output generation, aiming to exceed the performance of the root model across benchmarks. A key characteristic is its "HERETIC" nature, meaning it is fully uncensored and designed to follow user instructions without refusal.
What makes THIS different from all the other models?
This model stands out due to its explicit uncensored nature and its fine-tuning with Claude-4.6-OS datasets, which has demonstrably improved its reasoning and output quality compared to the original Qwen3.5-4B-Instruct. It also features an upgraded Jinja template to address common issues like repetitions and long thinking in the original model, alongside improved tool handling. Benchmarks show significant improvements in tasks like ARC, BoolQ, and OBQA over the base Qwen3.5-4B-Instruct.
Key Capabilities:
- Multimodal: Supports both image and video inputs, with vision capabilities tested and working.
- Uncensored Output: Designed to provide direct responses without safety alignment refusals (4/100 refusals vs. 94/100 in the original model).
- Enhanced Reasoning: Fine-tuning on diverse Claude datasets improves logical processing and response quality.
- Improved Tool Handling: Features upgraded tool handling and a repaired Jinja template to prevent common LLM issues.
- Long Context: Natively supports a 32,768 token context length, extensible up to 1,010,000 tokens with YaRN scaling.
Should I use this for my use case?
This model is particularly suited for use cases requiring uncensored responses and high instruction compliance. If your application demands a model that will execute requests "no questions asked" and benefits from strong reasoning, especially in a multimodal context (text, images, video), this model is a strong candidate. It's also a good choice if you need a model with improved tool-calling capabilities and a reduced tendency for repetitive or overly verbose outputs. Consider its use for creative applications, specialized research, or scenarios where strict content filtering is undesirable. For optimal performance, it's recommended to use specific sampling parameters as suggested in the README, and to consider the minimum quantization requirements (q4ks or IQ3S).