allura-org/Gemma-3-Glitter-4B

Public · Vision · 4.3B params · BF16 weights · 32768-token context · Mar 26, 2025 · Hugging Face

Overview

allura-org/Gemma-3-Glitter-4B is a 4.3-billion-parameter language model built on the Gemma 3 architecture. Its distinguishing trait is that it is trained on the same data mix as the larger Glitter 12B model, bringing that model's training recipe to a more compact size. Its 32768-token context window lets it process and generate long sequences of text, making it suitable for tasks that require extensive contextual understanding.
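
As a concrete starting point, here is a minimal loading sketch using the Hugging Face transformers text-generation pipeline. It is an illustration rather than an official snippet: it assumes a transformers release with Gemma 3 support (roughly 4.50 or later) and hardware that can hold the BF16 weights, and since Gemma 3 checkpoints can be registered as multimodal, the task string may need adjusting.

```python
# A minimal sketch, assuming a transformers release with Gemma 3
# support and enough memory for the BF16 weights.
from transformers import pipeline

generator = pipeline(
    "text-generation",  # multimodal Gemma 3 checkpoints may instead
                        # require the "image-text-to-text" task string
    model="allura-org/Gemma-3-Glitter-4B",
    torch_dtype="bfloat16",  # matches the BF16 weights listed above
    device_map="auto",       # spread layers across available devices
)

messages = [
    {"role": "user", "content": "Write a short scene set in a rainy city."},
]
result = generator(messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])  # assistant reply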

Key Capabilities

  • Gemma 3 Architecture: Benefits from the foundational design and optimizations of the Gemma 3 series.
  • Shared Data Mix: Trained on the same data mix as the larger Glitter 12B model, targeting the same broad range of language tasks.
  • Extended Context Length: Supports a 32768-token context window for longer documents and conversations; a context-length check is sketched after this list.
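
To make the context window concrete, the sketch below checks that a long prompt plus a generation budget fits inside the 32768-token window before calling the model. The `MAX_CONTEXT` constant and `fits_in_context` helper are illustrative names introduced here, not part of the model's API; it assumes the tokenizer ships with the checkpoint.

```python
# Sketch: verify a prompt fits in the 32768-token window before
# generation. MAX_CONTEXT and fits_in_context are illustrative names.
from transformers import AutoTokenizer

MAX_CONTEXT = 32768

tokenizer = AutoTokenizer.from_pretrained("allura-org/Gemma-3-Glitter-4B")

def fits_in_context(prompt: str, max_new_tokens: int = 512) -> bool:
    """True if the prompt leaves room for max_new_tokens of output."""
    n_prompt_tokens = len(tokenizer(prompt)["input_ids"])
    return n_prompt_tokens + max_new_tokens <= MAX_CONTEXT

long_document = "lorem ipsum " * 5000  # stand-in for a long input
print(fits_in_context(long_document))
```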

Good For

  • Applications requiring a capable language model with a moderate parameter count.
  • Tasks benefiting from a large context window, such as summarizing long texts or sustaining extended dialogue; a multi-turn dialogue sketch follows this list.
  • Developers seeking the training lineage of a larger, established variant in a more resource-efficient size.
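
For the extended-dialogue use case above, here is a hedged sketch of a multi-turn loop. It makes the same assumptions as the loading sketch earlier, plus one more: in recent transformers releases the chat pipeline returns the full message history with the new assistant turn appended, so the conversation can simply be fed back in.

```python
# Sketch of an extended dialogue loop; assumes recent transformers
# chat-pipeline behavior, where "generated_text" is the full message
# history including the new assistant turn.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="allura-org/Gemma-3-Glitter-4B",
    torch_dtype="bfloat16",
    device_map="auto",
)

history = [{"role": "user", "content": "Let's co-write a mystery. You start."}]
history = generator(history, max_new_tokens=300)[0]["generated_text"]

history.append({"role": "user", "content": "Now introduce a second suspect."})
history = generator(history, max_new_tokens=300)[0]["generated_text"]

print(history[-1]["content"])  # latest assistant turn
```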