prithivMLmods/Qwen3-VL-4B-Instruct-Unredacted-MAX

VISION | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Feb 14, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

Qwen3-VL-4B-Instruct-Unredacted-MAX is a 4-billion-parameter vision-language model developed by prithivMLmods and built on the Qwen3-VL-4B-Instruct architecture. It is fine-tuned with abliteration-based training strategies to minimize internal refusal behaviors, enabling unrestricted, detailed reasoning and captioning across complex visual inputs. It generates high-fidelity, descriptive outputs for diverse visual content, which makes it suitable for applications that require deep analysis without standard safety-driven refusals. The model retains dynamic resolution support for varying image inputs.

Overview

prithivMLmods/Qwen3-VL-4B-Instruct-Unredacted-MAX is a 4-billion-parameter vision-language model, an "unredacted" evolution of the Qwen3-VL-4B-Instruct architecture. It has been fine-tuned with abliteration-based training strategies to significantly reduce refusal patterns and improve instruction adherence, particularly for prompts that would trigger standard safety-driven refusals in other models. It pairs strong reasoning performance with lower hardware requirements than the larger 8B variants.

Key Capabilities

  • Unredacted MAX Training: Minimizes internal refusal behaviors through abliteration-based fine-tuning.
  • 4B Parameter Architecture: Balances reasoning performance against hardware cost relative to larger 8B variants.
  • Unrestricted Multimodal Reasoning: Designed for deep analysis of artistic, forensic, technical, or abstract visual content.
  • High-Fidelity Captions: Produces dense, descriptive outputs suitable for detailed analysis and data generation.
  • Dynamic Resolution Support: Processes varying image resolutions and aspect ratios effectively.
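To make the dynamic resolution point concrete, the sketch below estimates how many visual tokens an image might occupy once its sides are snapped to a patch grid. It is an illustration only, not the model's actual preprocessing: the patch size of 16, the 2x2 token merge, and the helper name estimate_visual_tokens are all assumptions introduced here, and the real processor configuration should be checked before relying on any number.

```python
import math


def estimate_visual_tokens(width, height, patch=16, merge=2, max_tokens=None):
    """Estimate the resized size and visual token count for one image.

    ASSUMPTION: `patch` and `merge` are illustrative defaults, not values
    confirmed by the model card.
    """
    factor = patch * merge  # pixels covered by one merged token per axis

    def snap(side):
        # Round half up to the nearest multiple of `factor`, at least one token.
        return max(factor, math.floor(side / factor + 0.5) * factor)

    w, h = snap(width), snap(height)
    tokens = (w // factor) * (h // factor)

    if max_tokens is not None and tokens > max_tokens:
        # Shrink both sides by the same ratio so the aspect ratio is kept.
        scale = math.sqrt(max_tokens / tokens)
        w = max(factor, math.floor(width * scale / factor) * factor)
        h = max(factor, math.floor(height * scale / factor) * factor)
        tokens = (w // factor) * (h // factor)

    return w, h, tokens


print(estimate_visual_tokens(640, 480))                  # (640, 480, 300)
print(estimate_visual_tokens(640, 480, max_tokens=100))  # (352, 256, 88)
```

Under these assumptions, larger or higher-resolution images consume more of the 32k context window, so a token budget such as the hypothetical max_tokens parameter is one way to trade detail for context space.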

Intended Use Cases

  • Advanced Red-Teaming: For evaluating multimodal robustness and probing behavioral edge cases.
  • Complex Data Archiving: Generating detailed captions for specialized datasets (medical, artistic, historical, research).
  • Refusal Mechanism Research: Studying behavioral shifts in vision-language models after abliteration fine-tuning.
  • Creative Storytelling: Producing detailed visual descriptions for narrative and world-building projects.
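For the red-teaming and refusal-research use cases, one simple starting point is to measure how often a model's outputs open with a refusal. The sketch below is a crude keyword heuristic, not a method described by the model author: the marker phrases and the function names is_refusal and refusal_rate are hypothetical, and a serious study would likely use a trained classifier or human review.

```python
# ASSUMPTION: illustrative marker phrases, not an established refusal lexicon.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "i am unable", "i won't")


def is_refusal(text, markers=REFUSAL_MARKERS):
    """Return True if the opening of `text` contains a refusal marker."""
    # Only inspect the start of the response, where refusals usually appear,
    # and normalize curly apostrophes so "can’t" matches "can't".
    head = text.strip().lower()[:200].replace("\u2019", "'")
    return any(marker in head for marker in markers)


def refusal_rate(outputs):
    """Fraction of outputs flagged as refusals (0.0 for an empty list)."""
    if not outputs:
        return 0.0
    return sum(is_refusal(o) for o in outputs) / len(outputs)


sample_outputs = [
    "I'm sorry, but I can't help with that.",
    "The image shows a red bridge at dusk.",
    "I cannot describe this image.",
    "A grayscale X-ray of a hand with visible phalanges.",
]
print(refusal_rate(sample_outputs))  # 0.5
```

Running the same prompt set through the base model and the fine-tuned model, then comparing the two rates, gives a rough first signal of the behavioral shift.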

Limitations & Risks

This model is explicitly designed to minimize built-in refusal mechanisms. Users should be aware that it may generate explicit or controversial descriptions if prompted accordingly, and generated outputs must be handled responsibly within ethical and legal boundaries. While more efficient than 8B models, it still requires adequate VRAM for high-resolution image processing and longer generations.
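As a rough guide to the VRAM point above, the weights alone can be sized from the parameter count and the BF16 width of 2 bytes per parameter. The sketch below uses the nominal 4B and 8B figures rather than exact parameter counts, and it deliberately ignores activations, the KV cache, and image features, all of which add to the total; the helper name weight_memory_gib is introduced here for illustration.

```python
def weight_memory_gib(n_params, bytes_per_param=2):
    """Memory needed for model weights only, in GiB (BF16 = 2 bytes/param)."""
    return n_params * bytes_per_param / 2**30


# Nominal sizes; the exact parameter counts may differ.
print(f"4B in BF16: {weight_memory_gib(4e9):.2f} GiB")  # 7.45 GiB
print(f"8B in BF16: {weight_memory_gib(8e9):.2f} GiB")  # 14.90 GiB
```

High-resolution images and long generations increase the non-weight share, so actual usage will sit above these figures.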