TheDrummer/Gemma-3-R1-12B-v1

Hugging Face
VISIONConcurrency Cost:1Model Size:12BQuant:FP8Ctx Length:32kPublished:Aug 11, 2025Architecture:Transformer0.0K Warm

TheDrummer/Gemma-3-R1-12B-v1 is a 12 billion parameter Gemma 3 R1 model developed by TheDrummer, featuring a 32768 token context length. This model is specifically tuned for enhanced reasoning capabilities and exhibits less positive bias in its responses. It is designed to be vision-capable, expanding its utility beyond text-only applications.

Loading preview...

Overview

TheDrummer/Gemma-3-R1-12B-v1 is a 12 billion parameter language model based on the Gemma 3 R1 architecture. Developed by TheDrummer, this model has been specifically tuned to improve its reasoning abilities and to produce less overtly positive outputs. A notable feature is its intended vision capability, suggesting potential for multimodal applications.

Key Capabilities

  • Enhanced Reasoning: The model has undergone specific tuning to unlock more advanced reasoning capabilities.
  • Adjusted Tone: Designed to exhibit less positivity in its responses, offering a more neutral or varied output tone.
  • Vision Capable: The model is expected to support vision inputs, enabling multimodal use cases.

Usage Notes

Users may need to prefill <think> at the beginning of the assistant's turn to guide its responses. The model's design allows for creative modifications of reasoning tags, such as <evil_think> or <creative_think>, as <think> is not a special token.