eekay/gemma-2b-it-noised-np0.1-attn-emb-s42

TEXT GENERATIONConcurrency Cost:1Model Size:2.5BQuant:BF16Ctx Length:8kPublished:Jun 18, 2026Architecture:Transformer Cold

The eekay/gemma-2b-it-noised-np0.1-attn-emb-s42 is a 2.5 billion parameter instruction-tuned language model based on the Gemma architecture. This model incorporates noise during its training, specifically with a noise probability of 0.1, and features modifications to its attention embeddings. It is designed for general language understanding and generation tasks, building upon the foundational capabilities of the Gemma family.

Loading preview...

Model Overview

The eekay/gemma-2b-it-noised-np0.1-attn-emb-s42 is a 2.5 billion parameter instruction-tuned model derived from the Gemma architecture. While specific details on its development and training data are not provided in the model card, its naming convention suggests a focus on exploring the effects of noise injection during training and modifications to attention embeddings.

Key Characteristics

  • Base Architecture: Gemma, a family of lightweight, state-of-the-art open models from Google.
  • Parameter Count: 2.5 billion parameters, making it suitable for applications requiring a balance between performance and computational efficiency.
  • Context Length: Supports an 8192-token context window, allowing for processing longer inputs and generating more coherent, extended outputs.
  • Training Modifications: The model name indicates specific training interventions, including a noise probability of 0.1 (np0.1) and adjustments to attention embeddings (attn-emb-s42). These modifications likely aim to enhance robustness or explore different performance characteristics compared to standard Gemma models.

Potential Use Cases

Given its instruction-tuned nature and moderate size, this model could be suitable for:

  • General-purpose text generation and completion.
  • Instruction following for various NLP tasks.
  • Experimentation with models trained under specific noise conditions.
  • Applications where a smaller, efficient Gemma-based model with potentially enhanced robustness is desired.