prithivMLmods/Gliese-Qwen3.5-27B-Abliterated-Caption

VISIONConcurrency Cost:2Model Size:27BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Mar 13, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Gliese-Qwen3.5-27B-Abliterated-Caption is a 27 billion parameter vision-language model developed by prithivMLmods, built upon Qwen3.5-27B. It is specifically designed for generalized and unfiltered image captioning, utilizing advanced refusal direction analysis and abliterated training to minimize internal refusal behaviors. This model excels at generating highly detailed, long-form, and context-aware descriptions of scenes, objects, and environments, making it suitable for dataset generation and multimodal research.

Loading preview...

Gliese-Qwen3.5-27B-Abliterated-Caption Overview

Gliese-Qwen3.5-27B-Abliterated-Caption is a 27 billion parameter vision-language model developed by prithivMLmods, an evolution of Qwen3.5-27B. Its core innovation lies in its "abliterated" training strategy, which incorporates advanced refusal direction analysis to significantly reduce internal refusal behaviors. This allows the model to generate generalized and unfiltered image captions with exceptional detail and depth of visual understanding.

Key Capabilities

  • Unfiltered and Detailed Caption Generation: Produces comprehensive visual descriptions without excessive refusal, offering rich context-aware insights into scenes, objects, and environments.
  • Optimized Visual Understanding: Enhanced to provide high-fidelity, long-form, and semantically detailed captions.
  • 27B Parameter Architecture: Built on Qwen3.5-27B, offering stronger multimodal reasoning and improved caption quality.
  • Mitigated Refusal Behaviors: Utilizes targeted activation analysis to identify and lessen refusal directions within the model's latent space.

Intended Use Cases

  • High-Detail Image Captioning: Generating extremely descriptive captions for various images.
  • Dataset Generation: Creating large-scale caption datasets for multimodal training and research.
  • Vision-Language Research: Studying multimodal reasoning and captioning behaviors, particularly in contexts requiring unfiltered outputs.
  • Annotation Automation: Assisting in automatic labeling and visual description tasks.
  • Local Multimodal AI Deployment: Suitable for running powerful captioning models on local GPUs for development workflows.

Limitations & Risks

It's important to note that this model intentionally reduces built-in refusal mechanisms, which means it may produce unfiltered outputs, including explicit or controversial captions depending on the input images. Users are responsible for handling generated content ethically and lawfully.