eekay/gemma-2b-it-noised-np0.25-attn-emb

Text Generation · Concurrency Cost: 1 · Model Size: 2.5B · Quant: BF16 · Context Length: 8k · Published: Apr 28, 2026 · Architecture: Transformer

eekay/gemma-2b-it-noised-np0.25-attn-emb is a 2.5 billion parameter instruction-tuned language model based on the Gemma architecture. The name indicates noise injection (np0.25, presumably a noise parameter of 0.25) applied to the model's attention and embedding layers, suggesting an experimental or specialized fine-tuning approach. Its primary differentiator is this noised training regime, which may target robustness to perturbed inputs or serve as a research artifact for studying the effects of such modifications. With an 8192-token context length, it is suitable for tasks requiring moderate input and output lengths.
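A minimal loading sketch, assuming the checkpoint is hosted on the Hugging Face Hub under this identifier and remains compatible with the stock Gemma loading path in transformers:

```python
# Minimal sketch: load the checkpoint with Hugging Face transformers.
# Assumes the model is hosted on the Hub under this identifier and uses
# the standard Gemma architecture classes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "eekay/gemma-2b-it-noised-np0.25-attn-emb"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above
    device_map="auto",           # requires `accelerate`
)
```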


Overview

eekay/gemma-2b-it-noised-np0.25-attn-emb is a 2.5 billion parameter instruction-tuned model built on the Gemma architecture. It distinguishes itself through modifications made during training, namely noise injection (np0.25) targeting its attention and embedding layers. These changes suggest an experimental approach aimed at exploring how such perturbations affect model performance and robustness.

Key Characteristics

  • Architecture: Based on the Gemma family of models.
  • Parameter Count: 2.5 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports an 8192-token context window, enabling processing of moderately long inputs and generating comprehensive responses.
  • Unique Training: Incorporates noise injection (np0.25) in its attention and embedding layers, indicating a specialized fine-tuning process rather than standard instruction tuning (see the illustrative sketch after this list).
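
The model card does not document how the noise was applied. Purely as an illustration, one common approach is to perturb embedding (and attention) activations with scaled Gaussian noise during fine-tuning; the sketch below reads "np0.25" as a noise scale of 0.25, which is an assumption:

```python
# Illustrative sketch only: Gaussian noise injection via a forward hook.
# The exact mechanism behind "np0.25" is undocumented; 0.25 is assumed
# here to be the noise scale.
import torch

NOISE_SCALE = 0.25  # assumed interpretation of "np0.25"

def add_gaussian_noise(module, inputs, output):
    """Forward hook: perturb activations with scaled Gaussian noise."""
    if module.training:
        return output + NOISE_SCALE * torch.randn_like(output)
    return output

# Attach to the embedding layer; attention projections could be hooked
# the same way (reuses `model` from the loading sketch above).
handle = model.get_input_embeddings().register_forward_hook(add_gaussian_noise)
# ... fine-tune ...
# handle.remove()  # detach the hook afterwards
```

Hooking activations this way leaves the base weights untouched, which matches the pattern of an experimental fine-tune layered on a stock Gemma checkpoint, though the author's actual method may differ.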

Potential Use Cases

Given the experimental nature of its training, this model could be particularly interesting for:

  • Research and Development: Exploring the effects of noise injection and architectural modifications on LLM performance and generalization.
  • Specific Robustness Tasks: Potentially suitable for applications where models need to perform well under noisy or slightly perturbed input conditions, depending on the exact nature of the 'noised' training.
  • Instruction-Following: As an instruction-tuned model, it is designed to follow user prompts and generate relevant outputs for a variety of general language tasks; a minimal generation sketch follows this list.
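
A minimal generation sketch, assuming the checkpoint retains the standard Gemma chat template (and reusing `model` and `tokenizer` from the loading sketch above); the prompt is illustrative:

```python
# Minimal generation sketch using the tokenizer's chat template.
messages = [
    {"role": "user", "content": "Explain noise injection during fine-tuning in two sentences."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```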