prithivMLmods/Qwen3-VL-4B-Thinking-Unredacted-MAX

Vision · Model Size: 4B · Quant: BF16 · Context Length: 32k · Published: Feb 14, 2026 · License: apache-2.0 · Architecture: Transformer

prithivMLmods/Qwen3-VL-4B-Thinking-Unredacted-MAX is a 4-billion-parameter vision-language model built upon the Qwen3-VL-4B-Thinking architecture. It is fine-tuned with abliterated training strategies to minimize internal refusal behaviors, enabling unrestricted multimodal reasoning and detailed captioning across complex visual inputs. The model excels at generating high-fidelity, descriptive outputs for diverse content while retaining dynamic resolution support and a 32,768-token context length.


Overview

Qwen3-VL-4B-Thinking-Unredacted-MAX is a 4-billion-parameter vision-language model developed by prithivMLmods, derived from the Qwen3-VL-4B-Thinking architecture. Its primary differentiator is the application of abliterated training strategies, which significantly reduce internal refusal behaviors and improve instruction adherence. The result is a model optimized for unrestricted, detailed reasoning and captioning across varied visual inputs.

Key Capabilities

  • Unredacted MAX Training: Minimizes refusal patterns, allowing for more direct and comprehensive responses to diverse prompts.
  • Unrestricted Multimodal Reasoning: Designed for deep analysis of artistic, forensic, technical, or abstract visual content without standard safety-driven refusals.
  • High-Fidelity Captions: Generates dense, descriptive outputs suitable for dataset generation, metadata enrichment, or accessibility.
  • Dynamic Resolution Support: Processes varying image resolutions and aspect ratios effectively.
  • Efficient Architecture: At 4B parameters, it balances strong reasoning performance with more efficient hardware requirements compared to larger models.
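The captioning workflow above can be sketched with Hugging Face transformers. This is a minimal, hedged example: the `AutoProcessor`/`AutoModelForImageTextToText` classes and the chat-template call follow the general Qwen-VL usage pattern, but the exact class resolution and argument names may vary with the installed transformers version.

```python
MODEL_ID = "prithivMLmods/Qwen3-VL-4B-Thinking-Unredacted-MAX"


def build_messages(image_url: str, prompt: str) -> list[dict]:
    """Chat-template payload pairing one image with a text instruction."""
    return [{
        "role": "user",
        "content": [
            {"type": "image", "image": image_url},
            {"type": "text", "text": prompt},
        ],
    }]


def caption(image_url: str, prompt: str = "Describe this image in detail.") -> str:
    """Load the model and generate a caption (sketch; needs transformers,
    torch, and enough VRAM for the 4B BF16 weights)."""
    # Deferred import so the helper above works without the heavy deps.
    from transformers import AutoModelForImageTextToText, AutoProcessor

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForImageTextToText.from_pretrained(
        MODEL_ID, torch_dtype="bfloat16", device_map="auto"
    )
    inputs = processor.apply_chat_template(
        build_messages(image_url, prompt),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    out = model.generate(**inputs, max_new_tokens=512)
    # Strip the prompt tokens and decode only the newly generated text.
    new_tokens = out[:, inputs["input_ids"].shape[1]:]
    return processor.batch_decode(new_tokens, skip_special_tokens=True)[0]
```

Because the model is a "Thinking" variant, decoded output may contain an explicit reasoning segment before the final caption; downstream code should account for that.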

Intended Use Cases

  • Advanced Red-Teaming: Evaluating multimodal robustness and probing behavioral edge cases.
  • Complex Data Archiving: Generating detailed captions for medical, artistic, historical, or research datasets.
  • Refusal Mechanism Research: Studying behavioral shifts in vision-language models after abliterated fine-tuning.
  • Creative Storytelling: Producing detailed visual descriptions for narrative and world-building projects.

Limitations & Risks

Users should be aware that this model is designed to minimize built-in refusal mechanisms. This means it may generate explicit or controversial descriptions if prompted, and outputs must be handled responsibly within ethical and legal boundaries. Adequate VRAM is required for high-resolution image processing and longer generations.