Name: georgehenney/Qwen3-VL-4B-Instruct-heretic-7refusal API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: georgehenney

Model Overview

This model, georgehenney/Qwen3-VL-4B-Instruct-heretic-7refusal, is a modified version of the Qwen3-VL-4B-Instruct, a 4 billion parameter vision-language model developed by Qwen. The primary modification, performed using the Heretic v1.0.1 tool, significantly reduces the model's refusal rate from 92/100 to 7/100, making it a "decensored" variant.

Key Capabilities

Vision-Language Integration: Offers comprehensive upgrades in text understanding, generation, visual perception, and reasoning.
Reduced Refusals: Engineered to provide responses with fewer content restrictions compared to its base model.
Visual Agent: Capable of operating PC/mobile GUIs, recognizing elements, understanding functions, and completing tasks.
Advanced Spatial Perception: Judges object positions, viewpoints, and occlusions, supporting 2D and 3D grounding for spatial reasoning.
Long Context & Video Understanding: Features a native 256K context, expandable to 1M, enabling it to handle extensive text and hours-long video content with full recall.
Enhanced Multimodal Reasoning: Excels in STEM/Math tasks, providing causal analysis and logical, evidence-based answers.
Upgraded Visual Recognition: Trained on broader, higher-quality data to recognize a wide array of entities including celebrities, products, and landmarks.
Expanded OCR: Supports 32 languages and is robust in challenging conditions (low light, blur, tilt), with improved handling of rare characters and document structures.

Use Cases

This model is particularly suited for applications requiring a powerful vision-language understanding with a preference for fewer content restrictions. It can be leveraged for:

Automated UI interaction and task completion via its Visual Agent capabilities.
Complex multimodal reasoning in scientific and mathematical domains.
Detailed image and video analysis over long durations.
Multilingual OCR and document processing.
Creative content generation where broader response flexibility is desired.

Overview

Model Overview

Key Capabilities

Use Cases

Full Model Card (README)