Name: coder3101/Qwen3-VL-4B-Instruct-heretic API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: coder3101

Overview

This model, coder3101/Qwen3-VL-4B-Instruct-heretic, is a 4 billion parameter vision-language model based on Qwen's Qwen3-VL-4B-Instruct. Its primary distinction is the application of the Heretic v1.0.1 abliteration process, which significantly reduces refusal rates compared to the original model. While the original Qwen3-VL-4B-Instruct exhibited 91 refusals out of 100, this 'heretic' version shows only 3 refusals out of 100, as measured by KL divergence of 0.47.

Key Capabilities

This model inherits the comprehensive upgrades of the Qwen3-VL series, offering advanced multimodal functionalities:

Visual Agent: Capable of operating PC/mobile GUIs, recognizing elements, understanding functions, and completing tasks.
Visual Coding Boost: Generates Draw.io/HTML/CSS/JS from images or videos.
Advanced Spatial Perception: Judges object positions, viewpoints, and occlusions, enabling 2D and 3D grounding for spatial reasoning.
Long Context & Video Understanding: Features a native 256K context, expandable to 1M, for handling extensive text and hours-long video with full recall.
Enhanced Multimodal Reasoning: Excels in STEM/Math tasks, providing causal analysis and logical, evidence-based answers.
Upgraded Visual Recognition: Broad and high-quality pretraining allows recognition of a wide array of entities, including celebrities, products, and landmarks.
Expanded OCR: Supports 32 languages, with robust performance in challenging conditions and improved long-document structure parsing.
Text Understanding: Offers seamless text-vision fusion for unified comprehension on par with pure LLMs.

When to Use This Model

This model is particularly suited for use cases where the original Qwen3-VL-4B-Instruct's refusal behavior is undesirable. Developers needing a powerful vision-language model with strong multimodal capabilities, but requiring less constrained or more direct responses, will find this 'heretic' version beneficial. It is ideal for applications demanding robust visual understanding, complex reasoning, and reduced content filtering.

Overview

Overview

Key Capabilities

When to Use This Model

Full Model Card (README)