Name: tvall43/Qwen3.5-4B-heretic API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: tvall43

Model Overview

tvall43/Qwen3.5-4B-heretic is a 4.5 billion parameter multimodal causal language model, derived from the Qwen3.5 architecture. It is a decensored variant of unsloth/Qwen3.5-4B, processed using Heretic v1.2.0. The model supports a native context length of 32,768 tokens, extensible up to 1,010,000 tokens using RoPE scaling techniques like YaRN.

Key Differentiators

Decensored Output: Significantly reduces refusals, with 8/100 refusals compared to 99/100 in the original model, making it suitable for less restricted content generation.
Multimodal Capabilities: Features a unified vision-language foundation, enabling early fusion training on multimodal tokens for cross-generational parity with Qwen3 and Qwen3-VL models across reasoning, coding, agents, and visual understanding benchmarks.
Efficient Architecture: Incorporates Gated Delta Networks and sparse Mixture-of-Experts for high-throughput inference with minimal latency.
Scalable RL Generalization: Utilizes reinforcement learning scaled across million-agent environments for robust real-world adaptability.
Global Linguistic Coverage: Expanded support for 201 languages and dialects.

Performance Highlights

While the base Qwen3.5-4B model demonstrates strong performance across various benchmarks including MMLU-Pro (79.1), C-Eval (85.1), and instruction following tasks, the 'heretic' version's primary distinction lies in its reduced refusal rate. It also shows competitive scores in vision-language tasks such as MMMU (77.6) and MathVision (74.6).

Recommended Use Cases

Applications requiring less restrictive content: Ideal for scenarios where the original model's refusal rates are too high.
Multimodal tasks: Capable of handling combined text, image, and video inputs for summarization, question answering, and agentic workflows.
Long context processing: Supports extended context lengths for complex documents and conversations.
Agentic applications: Excels in tool calling, with recommendations for use with Qwen-Agent and Qwen Code for building AI agent applications.

Overview

Model Overview

Key Differentiators

Performance Highlights

Recommended Use Cases

Full Model Card (README)