## Overview
Gemma 2 27B IT is a 27 billion parameter instruction-tuned model from Google, part of the Gemma family of lightweight, open models. It is a text-to-text, decoder-only large language model, available in English with open weights. The model was trained on a diverse dataset of 13 trillion tokens, including web documents, code, and mathematical text, to enhance its capabilities across various domains.
## Key Capabilities
- Text Generation: Excels at generating creative text formats, code, marketing copy, and email drafts.
- Conversational AI: Suitable for powering chatbots, virtual assistants, and interactive applications.
- Text Summarization: Can generate concise summaries of documents, research papers, and reports.
- Reasoning: Demonstrates strong performance in reasoning tasks, as evidenced by benchmarks like MMLU (75.2) and GSM8K (74.0).
- Code Generation: Achieves a pass@1 score of 51.8 on HumanEval and 62.6 on MBPP, indicating proficiency in code-related tasks.
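For conversational use, Gemma's instruction-tuned variants expect a specific turn format: each turn is wrapped in `<start_of_turn>`/`<end_of_turn>` markers with the roles `user` and `model`, and the prompt ends with an open model turn. The helper below is an illustrative sketch of that documented format; `build_gemma_prompt` is not part of any official API, and in practice a tokenizer's built-in chat template would handle this.

```python
def build_gemma_prompt(messages):
    """Format (role, text) turns into a Gemma-style chat prompt.

    `messages` is a list of (role, text) pairs, where role is "user" or
    "model". The returned string ends with an open model turn so that
    generation continues as the model's reply.
    """
    parts = []
    for role, text in messages:
        parts.append(f"<start_of_turn>{role}\n{text}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

prompt = build_gemma_prompt(
    [("user", "Summarize this report in two sentences.")]
)
```

Multi-turn conversations are handled the same way: append each prior user and model turn as a pair before the final open model turn.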
## Training and Infrastructure
The model was trained on Google's latest-generation [Tensor Processing Unit (TPU)][tpu] hardware (TPUv5p) using [JAX][jax] and [ML Pathways][ml-pathways]. This infrastructure provides significant computational power, memory, and scalability, contributing to the model's performance and training efficiency.
## Ethics and Safety
Google applied rigorous filtering for CSAM (child sexual abuse material) and sensitive personal data during data preprocessing. The model underwent extensive ethics and safety evaluations, including red-teaming and benchmarks such as RealToxicity and BBQ, to verify adherence to internal policies and responsible AI development practices.