allura-org/Gemma-3-Glitter-12B

Vision · 12B parameters · FP8 · 32,768-token context · Available on Hugging Face
Overview

Gemma-3-Glitter-12B is a 12-billion-parameter model developed by allura-org, built on the Gemma 3 IT architecture. Unlike general-purpose LLMs, its primary focus is creative writing. The model is a 50/50 merge of two specialized fine-tunes: one trained on creative-writing completions and one on roleplay instruct data.

Key Capabilities

  • Creative Writing: Optimized for generating long-form narrative content, leveraging approximately 20 million tokens of completion training on creative writing.
  • Roleplay (RP) Scenarios: Incorporates around 13.5 million tokens of instruct-based training specifically for roleplay, including examples with system prompts.
  • Vision Support: Notably, this model has re-integrated vision capabilities, allowing for multimodal creative applications.
  • Gemma2/3 Instruct Format: Utilizes the standard Gemma2/3 instruct format but has been further trained to recognize and effectively use an optional system role.

Good For

  • Developers and writers seeking a model specialized in generating creative narratives, stories, and descriptive text.
  • Applications requiring roleplay-oriented responses with structured system prompts.
  • Use cases where vision input can enhance creative content generation.