eekay/gemma-2b-it-noised-np0.1-attn-emb-s2

TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kPublished:Jun 16, 2026Architecture:Transformer Cold

The eekay/gemma-2b-it-noised-np0.1-attn-emb-s2 model is a 2 billion parameter instruction-tuned variant of the Gemma architecture. This model incorporates noise during training, specifically with a noise probability of 0.1, and features modifications to its attention embeddings. It is designed for general language understanding and generation tasks, leveraging a substantial context length of 32768 tokens.

Loading preview...

Model Overview

The eekay/gemma-2b-it-noised-np0.1-attn-emb-s2 is a 2 billion parameter instruction-tuned model based on the Gemma architecture. This model incorporates specific training modifications, including the introduction of noise with a probability of 0.1 and adjustments to its attention embeddings. It is designed to handle a significant amount of input, supporting a context length of 32768 tokens.

Key Characteristics

  • Architecture: Gemma-based, a compact yet powerful foundation for language tasks.
  • Parameter Count: 2 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Features an extended context window of 32768 tokens, enabling the processing of longer inputs and maintaining conversational coherence over extended interactions.
  • Training Modifications: Includes "noised" training (np0.1) and altered attention embeddings (attn-emb-s2), suggesting experimental approaches to enhance robustness or performance.

Potential Use Cases

Given its instruction-tuned nature and substantial context window, this model is suitable for:

  • General-purpose conversational AI: Engaging in extended dialogues and understanding complex prompts.
  • Text generation: Creating coherent and contextually relevant long-form content.
  • Instruction following: Executing a wide range of tasks based on explicit instructions.

Further details regarding its specific training data, evaluation metrics, and intended applications are not provided in the current model card.