Name: MuXodious/Qwen3.5-4B-SOMPOA-heresy-v2 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: MuXodious

Model Overview

MuXodious/Qwen3.5-4B-SOMPOA-heresy-v2 is a 4 billion parameter Qwen3.5-4B fine-tune, created using P-E-W's Heretic engine with Self-Organizing Maps & Magnitude-Preserving Orthogonal Ablation (SOMPOA). This process significantly reduces model refusals (from 103/104 to 4/104) while maintaining a low KL Divergence of 0.0565. The base Qwen3.5 model, developed by the Qwen Team, is a causal language model with a vision encoder, featuring a native context length of 262,144 tokens, extensible up to 1,010,000 tokens via YaRN scaling.

Key Capabilities

Multimodal Learning: Achieves cross-generational parity with Qwen3 and outperforms Qwen3-VL models across reasoning, coding, agents, and visual understanding benchmarks, supporting text, image, and video inputs.
Efficient Hybrid Architecture: Incorporates Gated Delta Networks and sparse Mixture-of-Experts for high-throughput inference with minimal latency.
Scalable RL Generalization: Benefits from reinforcement learning scaled across million-agent environments for robust real-world adaptability.
Global Linguistic Coverage: Supports 201 languages and dialects, enabling inclusive worldwide deployment.
Agentic Usage & Tool Calling: Excels in tool calling, with recommended integration via Qwen-Agent and Qwen Code for terminal-based AI agent applications.
Ultra-Long Context: Natively handles up to 262,144 tokens and supports extension to 1,010,000 tokens using YaRN scaling techniques.

What makes THIS different from other models?

This model stands out due to its unique "heretication" process using SOMPOA, which drastically reduces refusal rates while preserving core capabilities. It combines the advanced multimodal and long-context features of the Qwen3.5 base model with a specialized fine-tuning approach focused on reducing undesirable outputs. As of July 2026, it boasts the 2nd best UGI score among models 20B and under.

Should I use this for my use case?

Use this model if: You require a multimodal model with strong performance in reasoning, coding, and visual understanding, particularly if you need to minimize refusals in generated content. Its long context window and agentic capabilities make it suitable for complex tasks, tool integration, and processing extensive documents or media. The SOMPOA fine-tuning makes it a strong candidate for applications where controlled output and reduced refusal rates are critical.
Consider alternatives if: Your primary concern is raw benchmark performance without specific emphasis on refusal reduction, or if your application does not require multimodal input or ultra-long context processing.

Overview

Model Overview

Key Capabilities

What makes THIS different from other models?

Should I use this for my use case?

Full Model Card (README)