google/gemma-2b-it
Text Generation · Model size: 2.6B · Quantization: BF16 · Context length: 8K · Published: Feb 8, 2024 · License: Gemma · Architecture: Transformer · Concurrency cost: 1

Gemma-2b-it is a 2.6 billion parameter instruction-tuned decoder-only language model developed by Google. Built from the same research and technology as the Gemini models, it is designed for a variety of English text generation tasks including question answering, summarization, and reasoning. Its lightweight architecture allows for deployment in resource-limited environments like laptops or desktops, democratizing access to advanced AI capabilities.


Model Overview

Gemma-2b-it is a 2.6 billion parameter instruction-tuned model from Google's Gemma family, built from the same research and technology as the Gemini models. It is a lightweight, open-weights, English-language, text-to-text, decoder-only large language model. Its relatively small size makes it suitable for deployment in environments with limited resources, such as laptops, desktops, or personal cloud infrastructure.

Key Capabilities

  • Text Generation: Excels at various text generation tasks, including question answering, summarization, and reasoning.
  • Resource Efficiency: Designed for deployment in resource-constrained settings due to its compact size.
  • Instruction Following: Instruction-tuned for conversational use, adhering to a specific chat template for optimal performance (see the inference sketch after this list).
  • Fine-tuning Support: Provides scripts and guidance for supervised fine-tuning (SFT) using techniques like QLoRA and FSDP (a QLoRA configuration sketch also follows this list).
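
Because the instruction-tuned variant expects Gemma's turn markup, the minimal inference sketch below uses the transformers library's apply_chat_template helper, which inserts the <start_of_turn>/<end_of_turn> markers automatically. The prompt text and generation settings are illustrative, not taken from Google's documentation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the published BF16 weights
    device_map="auto",
)

# apply_chat_template wraps the conversation in Gemma's
# <start_of_turn>user ... <end_of_turn><start_of_turn>model format.
messages = [
    {"role": "user", "content": "Summarize why lightweight language models are useful."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that the Gemma chat template defines only user and model turns; if a system-style instruction is needed, it is typically folded into the first user message.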
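
For the fine-tuning support noted above, a QLoRA setup built on the Hugging Face peft and bitsandbytes libraries might look roughly like the sketch below. The rank, alpha, dropout, and target-module choices are assumptions for illustration, not values taken from Google's fine-tuning scripts.

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "google/gemma-2b-it"

# Load the base model with 4-bit NF4 quantization (the "Q" in QLoRA) to keep memory low.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters to the attention projections; these hyperparameters are illustrative.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights remain trainable
```

From here the adapted model can be trained with the standard transformers Trainer or trl's SFTTrainer on a chat-formatted dataset; the scripts referenced by the model card may differ in these details.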

Intended Use Cases

  • Content Creation: Generating creative text formats, marketing copy, or email drafts.
  • Conversational AI: Powering chatbots, virtual assistants, and interactive applications.
  • Research & Education: Serving as a foundation for NLP research, language learning tools, and knowledge exploration.

Training Details

The model was trained on a diverse dataset totaling 6 trillion tokens, including web documents, code, and mathematical texts. Data preprocessing involved rigorous CSAM filtering, sensitive data filtering, and quality/safety filtering. Training was conducted on Google's TPUv5e hardware using JAX and ML Pathways, emphasizing performance, memory efficiency, and scalability.