prithivMLmods/Qwen3-VL-2B-Instruct-abliterated-v1

VISIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kPublished:Oct 22, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Gated Cold

prithivMLmods/Qwen3-VL-2B-Instruct-abliterated-v1 is an abliterated variant of the Qwen3-VL-2B-Instruct model, designed for uncensored reasoning and captioning. This multimodal model generates detailed, descriptive captions and reasoning outputs across diverse visual contexts, including sensitive content, while supporting various aspect ratios and resolutions. It is optimized to bypass conventional content filters, providing high-fidelity descriptions for general, artistic, technical, abstract, or low-context images. The model's primary strength lies in its ability to produce comprehensive and unrestricted visual analyses.

Loading preview...

Model Overview

prithivMLmods/Qwen3-VL-2B-Instruct-abliterated-v1 is an "abliterated" (v1.0) variant of the Qwen3-VL-2B-Instruct model, specifically engineered for uncensored reasoning and captioning. Built upon the robust multimodal reasoning and instruction-following capabilities of its base architecture, this model focuses on generating detailed and descriptive outputs across a wide spectrum of visual and multimodal content.

Key Capabilities

  • Abliterated / Uncensored Captioning: Fine-tuned to bypass conventional content filters, providing factual, descriptive, and reasoning-rich outputs for sensitive or nuanced content.
  • High-Fidelity Descriptions: Generates comprehensive captions and reasoning for general, artistic, technical, abstract, or low-context images.
  • Robust Across Aspect Ratios: Maintains consistent performance across wide, tall, square, and irregular image dimensions.
  • Variational Detail Control: Capable of producing outputs ranging from concise summaries to intricate, fine-grained descriptions and reasoning.
  • Multilingual Adaptability: Primarily optimized for English, with potential for multilingual prompts through engineering.

Intended Use Cases

This model is particularly suited for:

  • Generating detailed, uncensored captions and reasoning for general-purpose or artistic datasets.
  • Research in content moderation, red-teaming, and generative safety evaluation.
  • Enabling descriptive captioning and reasoning for visual datasets typically excluded from mainstream models.
  • Creative applications such as storytelling, art generation, or multimodal reasoning tasks.
  • Captioning and reasoning for non-standard aspect ratios and stylized visual content.

Limitations

Users should be aware that this model may produce explicit, sensitive, or offensive descriptions depending on the image content and prompts. It is not recommended for production systems requiring strict content moderation, and accuracy can fluctuate for unfamiliar, synthetic, or highly abstract visual content.