Name: ansulev/Qwen3.5-4B-Claude-4.6-OS-Auto-Variable-HERETIC-UNCENSORED-THINKING API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: ansulev

Model Overview

This model, ansulev/Qwen3.5-4B-Claude-4.6-OS-Auto-Variable-HERETIC-UNCENSORED-THINKING, is a 4.5 billion parameter fine-tuned version of the Qwen 3.5 dense model. It was trained by ansulev using four Claude datasets, aiming to improve reasoning and output generation while maintaining strong benchmarks. A key differentiator is its "HERETIC" and "uncensored" nature, meaning it is designed to respond without refusal or safety alignment constraints present in many other models.

Key Capabilities

Enhanced Reasoning & Output: Fine-tuning on multiple Claude datasets has improved the model's reasoning abilities and generation quality, outperforming the base Qwen 3.5 model on various benchmarks.
Multimodal Support: The model supports vision input (images) and has been tested to work with new training, with video understanding capabilities also present in the base Qwen3.5 architecture.
Uncensored Responses: Designed as a "HERETIC" model, it provides uncensored output, making it suitable for use cases where typical safety alignments are undesirable.
Extended Context Length: Natively supports a context length of 262,144 tokens, extensible up to 1,010,000 tokens using YaRN scaling techniques.
Tool Handling: Features an upgraded Jinja template to improve tool handling and address issues like repetitions and long thinking loops found in the original model.

What makes THIS different from all the other models?

This model stands out due to its explicit uncensored and "HERETIC" training, which means it will generate responses without the typical safety alignments or refusals found in most LLMs. This is coupled with a fine-tuning process that leverages multiple Claude datasets to boost reasoning and output quality beyond the base Qwen 3.5 model, as evidenced by improved benchmark scores (e.g., higher ARC and BoolQ scores compared to the original Qwen3.5-4B-Instruct). Additionally, it includes an upgraded Jinja template specifically designed to fix common issues like repetitive outputs and long thinking loops, enhancing its practical usability for developers.

Should I use this for my use case?

You should consider this model if:

Your application requires uncensored and unfiltered responses without safety alignments or refusals.
You need a model with strong reasoning and generation capabilities in the 4-5 billion parameter range, especially if you value performance improvements over the base Qwen 3.5 model.
Your use case involves multimodal inputs, particularly images, as vision capabilities are confirmed to be working.
You require a model with extended context handling for processing very long texts or complex multi-turn conversations.
You are developing applications that benefit from improved tool handling and reduced issues like repetition or excessive internal thinking.

You might reconsider if:

Your application requires strict safety guidelines and content moderation, as this model is explicitly designed to be uncensored.
You need video understanding capabilities, as the video portions were not tested in this specific fine-tune, though the base model supports it.

Overview

Model Overview

Key Capabilities

What makes THIS different from all the other models?

Should I use this for my use case?

Full Model Card (README)