TheDrummer/Gemma-3-R1-12B-v1
TheDrummer/Gemma-3-R1-12B-v1 is a 12 billion parameter Gemma 3 R1 model developed by TheDrummer, featuring a 32768 token context length. This model is specifically tuned for enhanced reasoning capabilities and exhibits less positive bias in its responses. It is designed to be vision-capable, expanding its utility beyond text-only applications.
Loading preview...
Overview
TheDrummer/Gemma-3-R1-12B-v1 is a 12 billion parameter language model based on the Gemma 3 R1 architecture. Developed by TheDrummer, this model has been specifically tuned to improve its reasoning abilities and to produce less overtly positive outputs. A notable feature is its intended vision capability, suggesting potential for multimodal applications.
Key Capabilities
- Enhanced Reasoning: The model has undergone specific tuning to unlock more advanced reasoning capabilities.
- Adjusted Tone: Designed to exhibit less positivity in its responses, offering a more neutral or varied output tone.
- Vision Capable: The model is expected to support vision inputs, enabling multimodal use cases.
Usage Notes
Users may need to prefill <think> at the beginning of the assistant's turn to guide its responses. The model's design allows for creative modifications of reasoning tags, such as <evil_think> or <creative_think>, as <think> is not a special token.