Model Overview
mhenrichsen/gemma-2b is a 2.6 billion parameter base model from Google's Gemma family, derived from the same research and technology as the Gemini models. This text-to-text, decoder-only LLM is available with open weights and is designed for general text generation tasks in English. Its relatively small size is a key differentiator, enabling deployment on devices with limited resources such as laptops, desktops, or personal cloud infrastructure.
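As a rough illustration, the model can be loaded and prompted with the Hugging Face transformers library. This is a minimal sketch, assuming `transformers` and `torch` are installed and the weights can be fetched from the Hub; the prompt is a made-up example.

```python
# Sketch: basic text generation with mhenrichsen/gemma-2b via transformers.
# Assumes `transformers` and `torch` are installed and the model weights
# are downloadable from the Hugging Face Hub.

MODEL_ID = "mhenrichsen/gemma-2b"
prompt = "Write a short email inviting a colleague to a project kickoff."

if __name__ == "__main__":
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # halves memory vs. float32
        device_map="auto",           # place layers on available devices
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

On a laptop-class machine, bfloat16 keeps the ~2.6B-parameter weights around 5 GB of memory; `device_map="auto"` falls back to CPU when no GPU is present.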
Key Capabilities
- Text Generation: Capable of generating creative text formats, code, and email drafts.
- Question Answering & Summarization: Well-suited for extracting information and condensing text.
- Reasoning: Supports tasks requiring logical inference.
- Fine-tuning: Provides scripts and notebooks for supervised fine-tuning (SFT) using QLoRA or FSDP, adaptable for custom datasets.
- Deployment Flexibility: Runs efficiently on CPU, single-GPU, and multi-GPU setups; supports float16 and bfloat16 precision as well as 8-bit and 4-bit quantization via bitsandbytes for a reduced memory footprint.
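The 4-bit quantization path mentioned above can be sketched as follows. This is an illustrative configuration, not the canonical recipe; it assumes `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA GPU is available, and the NF4 settings shown are common defaults rather than values taken from this model card.

```python
# Sketch: loading mhenrichsen/gemma-2b in 4-bit via bitsandbytes to reduce
# the memory footprint. Assumes `transformers`, `accelerate`, and
# `bitsandbytes` are installed and a CUDA GPU is available.

MODEL_ID = "mhenrichsen/gemma-2b"

# Plain-dict view of the quantization settings; these map one-to-one onto
# transformers.BitsAndBytesConfig keyword arguments.
quant_kwargs = {
    "load_in_4bit": True,          # store weights as 4-bit blocks
    "bnb_4bit_quant_type": "nf4",  # NormalFloat4, common for LLM weights
}

if __name__ == "__main__":
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    config = BitsAndBytesConfig(
        load_in_4bit=quant_kwargs["load_in_4bit"],
        bnb_4bit_quant_type=quant_kwargs["bnb_4bit_quant_type"],
        bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for matmuls
    )
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        quantization_config=config,
        device_map="auto",
    )
    print(f"Memory footprint: {model.get_memory_footprint() / 1e9:.1f} GB")
```

Swapping `load_in_4bit` for `load_in_8bit=True` (and dropping the 4-bit-specific keys) gives the 8-bit path; 4-bit roughly quarters the weight memory relative to float16 at some cost in fidelity.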
Training and Evaluation
The model was trained on a diverse 6-trillion-token dataset comprising web documents, code, and mathematical text, with rigorous filtering for CSAM and other sensitive data. It was developed on Google's TPUv5e hardware using JAX and ML Pathways for efficient, scalable training. Benchmarks show competitive performance for its size, with an average score of 54.0 across tasks including MMLU, HellaSwag, and HumanEval. Ethical considerations, including bias, misinformation, and privacy, were addressed through extensive evaluation and mitigation strategies, in line with Google's Responsible AI principles.
Intended Usage
This model is ideal for developers and researchers looking to implement LLM capabilities in resource-constrained environments or for applications requiring efficient text generation, conversational AI, and knowledge exploration. Its open nature fosters innovation and accessibility in the AI ecosystem.