google/gemma-2-2b

Parameters: 2.6B
Precision: BF16
Context length: 8,192 tokens
License: gemma (gated; request access on Hugging Face)
Overview

What is Gemma 2 2B?

Gemma 2 2B is a 2.6 billion parameter, decoder-only large language model developed by Google. It is part of the Gemma family, which leverages the same research and technology as the Gemini models. This model is designed to be lightweight and efficient, making it suitable for deployment in environments with limited computational resources, such as laptops, desktops, or private cloud infrastructure.
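As a rough sketch of how such a model is typically loaded and queried, the following assumes the Hugging Face `transformers` library and PyTorch are installed, and that you have accepted the gemma license (the repository is gated). The `generate` helper below is illustrative, not part of the model's official API:

```python
def generate(prompt, max_new_tokens=64):
    # Deferred imports: transformers and torch are heavyweight and only
    # needed when a generation is actually requested.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    import torch

    model_id = "google/gemma-2-2b"  # gated: accept the license on Hugging Face first
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # BF16 matches the published weights and halves memory vs. FP32.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Because this is the base (pretrained) checkpoint rather than an instruction-tuned one, plain text-completion prompts generally work better than conversational instructions.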

Key Capabilities

  • Text Generation: Proficient in generating various forms of text, including creative content, code, and marketing copy.
  • Question Answering: Capable of providing answers to a wide range of queries.
  • Summarization: Can condense longer texts into concise summaries.
  • Reasoning: Demonstrates capabilities in logical reasoning tasks.
  • English Language Support: Primarily designed for English-language applications.
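For conversational tasks such as question answering, the instruction-tuned sibling checkpoint (gemma-2-2b-it) expects prompts wrapped in Gemma's turn markers. A minimal sketch of that formatting, assuming the documented `<start_of_turn>`/`<end_of_turn>` convention (in practice, `tokenizer.apply_chat_template` handles this for you):

```python
def format_turn(role, message):
    # One conversational turn in Gemma's chat format.
    return f"<start_of_turn>{role}\n{message}<end_of_turn>\n"

def build_prompt(user_message):
    # End with an open model turn so generation continues as the model's reply.
    return format_turn("user", user_message) + "<start_of_turn>model\n"

prompt = build_prompt("Summarize the following article in two sentences: ...")
```

The base google/gemma-2-2b checkpoint is not trained on this template; use plain completion-style prompts with it instead.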

Training and Performance

The 2B model was trained on a diverse dataset of 2 trillion tokens, encompassing web documents, code, and mathematical texts to strengthen its linguistic understanding, programming logic, and problem-solving abilities. Training ran on Google's Tensor Processing Unit (TPUv5p) hardware, using JAX and ML Pathways for efficient, scalable training. Benchmarks show competitive performance for its size across a range of tasks, including MMLU, HellaSwag, and HumanEval.

When to Use This Model

This model is ideal for developers looking for a powerful yet compact language model that can be deployed efficiently on consumer-grade hardware or within constrained cloud environments. Its versatility makes it suitable for applications requiring text generation, conversational AI, summarization, and educational tools, particularly where resource efficiency is a priority.
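To make the resource-efficiency claim concrete, a back-of-the-envelope weight-memory estimate (ignoring activations and KV cache, which add more at long contexts) shows why 2.6B parameters fit on consumer hardware, and how much quantization could save:

```python
PARAMS = 2.6e9  # parameter count from the model card

def weights_gb(bytes_per_param):
    # Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes).
    return PARAMS * bytes_per_param / 1e9

print(weights_gb(2))    # BF16: 2 bytes/param -> ~5.2 GB
print(weights_gb(1))    # INT8: ~2.6 GB
print(weights_gb(0.5))  # INT4: ~1.3 GB
```

At roughly 5.2 GB in BF16, the weights fit comfortably in the RAM of a typical laptop or a single consumer GPU, with quantized variants going lower still.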