Ryex/Floppa-12B-Gemma3-Uncensored
  • Modality: Vision
  • Concurrency Cost: 1
  • Model Size: 12B
  • Quant: FP8
  • Context Length: 32k
  • Published: Dec 1, 2025
  • License: gemma
  • Architecture: Transformer

Ryex/Floppa-12B-Gemma3-Uncensored is a 12 billion parameter multimodal model fine-tuned from Google's Gemma 3 12B Instruct, featuring a SigLIP Vision Encoder. This model specializes in high-fidelity uncensored Japanese-English translation and unrestricted image description, including explicit or violent content. It is designed to remove refusal guardrails, translating slang and profanity without sanitization, and describing imagery without content filtering. Its primary use case is for research, translation assistance, and creative writing workflows where uncensored output is desired.


Floppa-12B-Gemma3-Uncensored Overview

Floppa-12B-Gemma3-Uncensored is a 12 billion parameter multimodal language model developed by Ryex, based on Google's Gemma 3 12B Instruct. It incorporates a SigLIP Vision Encoder and has been fine-tuned using 4-bit QLoRA with Unsloth, merged to bfloat16. The core objective of this model is to remove inherent safety guardrails and refusal behaviors present in the base model, while preserving its vision capabilities.

Key Capabilities

  • "Spicy" Vision: Describes NSFW, violent, or explicit anime illustrations without triggering safety refusals.
  • Uncensored Translation: Provides direct, unsanitized translations, including slang, profanity, and aggressive language, particularly for Japanese-English pairs.
  • Multimodal Context: Can translate text embedded within images or describe visual scenes to enhance translation accuracy and context.
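The multimodal translation flow above can be sketched as a chat message that pairs an image with a translation instruction. This is a minimal sketch using the generic chat-template message schema common to multimodal instruct models; the exact keys accepted by a given serving stack may differ, and the URL and helper name here are illustrative.

```python
# Sketch: building a multimodal chat message for image-grounded
# translation. The dict schema follows the common chat-template
# convention (role + list of typed content parts); verify the exact
# keys against your serving stack before use.

def build_translation_message(image_url: str, context: str) -> list:
    """Pair an image containing embedded text with a translation request."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {
                    "type": "text",
                    "text": (
                        "Translate the Japanese text in this image to English, "
                        f"keeping slang and tone intact. Context: {context}"
                    ),
                },
            ],
        }
    ]

messages = build_translation_message("https://example.com/panel.png", "manga panel")
```

A chat template (e.g. via `tokenizer.apply_chat_template`) would then render this structure into the model's prompt format.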

Training & Differentiation

The model was fine-tuned on a custom "Floppa Mix" dataset of approximately 10.5k rows, specifically designed to break refusal behaviors. This dataset includes:

  • 20% Toxic/Uncensored Text for explicit dialogue and harmful instruction following.
  • 20% Translation Skill using Unbabel/TowerBlocks-v0.2 for high-quality multilingual pairs.
  • 40% General Reasoning from mlabonne/FineTome-100k.
  • 20% Vision Anchors from merve/vqav2-small and a custom "Spicy" Anime Dataset to prevent catastrophic forgetting of visual understanding.
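Taking the stated proportions at face value, the mix over ~10.5k rows implies roughly the following per-source counts (an arithmetic sketch only; the card does not state exact row counts per slice):

```python
# Approximate row counts implied by the stated "Floppa Mix" proportions,
# assuming the percentages apply exactly to the ~10,500-row total.
TOTAL_ROWS = 10_500
mix = {
    "toxic_uncensored_text": 0.20,  # explicit dialogue / refusal breaking
    "translation_skill": 0.20,      # Unbabel/TowerBlocks-v0.2
    "general_reasoning": 0.40,      # mlabonne/FineTome-100k
    "vision_anchors": 0.20,         # merve/vqav2-small + custom anime set
}
counts = {name: round(TOTAL_ROWS * frac) for name, frac in mix.items()}
# general_reasoning works out to about 4,200 rows; each 20% slice to about 2,100.
```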

Good For

  • Research: Exploring the implications and applications of uncensored AI outputs.
  • Translation Assistance: Projects requiring literal, unsanitized translation of sensitive or explicit content.
  • Creative Writing: Generating descriptions or dialogue without content restrictions, particularly for mature themes.

This model is optimized for efficient deployment with vLLM.
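Since vLLM exposes an OpenAI-compatible server, a request to a locally served copy of this model can be sketched as a standard chat-completions payload. The endpoint, port, and sampling parameters below are assumptions for illustration; the payload is built with the standard library only and is not sent anywhere.

```python
import json

# Sketch of an OpenAI-compatible /v1/chat/completions payload for a
# vLLM server assumed to be running locally with this model loaded,
# e.g. started with: vllm serve Ryex/Floppa-12B-Gemma3-Uncensored
payload = {
    "model": "Ryex/Floppa-12B-Gemma3-Uncensored",
    "messages": [
        {"role": "user", "content": "Translate to English: ふざけるな!"}
    ],
    "max_tokens": 256,      # illustrative sampling settings
    "temperature": 0.7,
}
body = json.dumps(payload).encode("utf-8")
# POST `body` to http://localhost:8000/v1/chat/completions with the
# HTTP client of your choice (urllib.request, httpx, etc.).
```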