google/gemma-2-9b

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 9B · Quant: FP8 · Context Length: 16k · Published: Jun 24, 2024 · License: gemma · Architecture: Transformer

Gemma 2 9B is a 9 billion parameter, decoder-only large language model developed by Google, built from the same research and technology as the Gemini models. It is a text-to-text model available in English, designed for a variety of text generation tasks including question answering, summarization, and reasoning. Its relatively small size allows for deployment in resource-limited environments, democratizing access to state-of-the-art AI.


Overview

Google's Gemma 2 9B is a 9 billion parameter, decoder-only large language model, part of the Gemma family derived from the same research as the Gemini models. It is designed for text-to-text generation in English, offering open weights for both pre-trained and instruction-tuned variants. The model is optimized for deployment in environments with limited resources, such as laptops or desktops, making advanced AI more accessible.
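For the instruction-tuned variant, prompts follow Gemma's documented turn-based chat format, which wraps each message in `<start_of_turn>`/`<end_of_turn>` markers. The sketch below shows how such a prompt is assembled; the helper name is illustrative, and in practice `tokenizer.apply_chat_template` from the `transformers` library builds this string for you:

```python
def format_gemma_chat(user_message: str) -> str:
    """Build a single-turn Gemma chat prompt (illustrative helper).

    The <start_of_turn>/<end_of_turn> markers are Gemma's documented
    chat control tokens; the trailing "model" turn cues the model to
    generate its reply.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


prompt = format_gemma_chat("Summarize the Gemma 2 release in one sentence.")
```

Note that the pre-trained (non-instruct) checkpoint expects plain text without these markers.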

Key Capabilities

  • Text Generation: Capable of generating various text formats, including creative content, code, and email drafts.
  • Conversational AI: Suitable for powering chatbots, virtual assistants, and interactive applications.
  • Text Summarization: Can produce concise summaries of documents, research papers, or reports.
  • Reasoning: Excels in tasks requiring logical reasoning and understanding, as evidenced by strong performance on benchmarks like MMLU (71.3%) and GSM8K (68.6%).

Training and Performance

The 9B model was trained on 8 trillion tokens, including diverse web documents, code, and mathematical texts. It leverages Google's TPUv5p hardware and JAX/ML Pathways software for efficient training. Benchmarks show competitive performance across various tasks, including a 40.2% pass@1 on HumanEval for code generation and 36.6% on MATH. The model incorporates rigorous data cleaning, including CSAM and sensitive data filtering, and has undergone extensive ethics and safety evaluations.
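The HumanEval figure above uses the standard pass@k metric: for each problem, draw n samples, count the c that pass the unit tests, and apply the unbiased estimator pass@k = 1 - C(n-c, k)/C(n, k). A minimal sketch (function name is mine, not from the source):

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k).

    n: total samples drawn per problem
    c: number of samples that passed the tests
    k: budget being evaluated (k=1 for pass@1)
    """
    if n - c < k:
        # Fewer failures than k draws: some correct sample is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For k=1 this reduces to the per-problem pass rate c/n, averaged over all problems in the benchmark.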

Intended Usage

This model is well-suited for content creation, conversational AI, and research in NLP. Its design prioritizes Responsible AI development, offering a high-performance open LLM for a wide range of applications.

Popular Sampler Settings

The three most popular sampler configurations among Featherless users for this model tune the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
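To illustrate how the truncation-style parameters interact, here is a minimal sketch of temperature scaling followed by top-k, top-p, and min-p filtering of a token distribution. It assumes NumPy, applies the filters in one common order (samplers differ on this), and omits the penalty parameters, which depend on generation history:

```python
import numpy as np


def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Return a renormalized token distribution after sampler filters.

    top_k: keep only the k highest-probability tokens (0 disables).
    top_p: keep the smallest set whose cumulative mass reaches top_p.
    min_p: drop tokens below min_p * (max token probability).
    """
    logits = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    mask = np.ones_like(probs, dtype=bool)
    if top_k > 0:
        kth = np.sort(probs)[-top_k]          # k-th largest probability
        mask &= probs >= kth
    if top_p < 1.0:
        order = np.argsort(probs)[::-1]       # descending by probability
        cum = np.cumsum(probs[order])
        keep = cum - probs[order] < top_p     # mass before token is < top_p
        nucleus = np.zeros_like(mask)
        nucleus[order[keep]] = True
        mask &= nucleus
    if min_p > 0.0:
        mask &= probs >= min_p * probs.max()

    probs = np.where(mask, probs, 0.0)
    return probs / probs.sum()


p = filter_distribution([2.0, 1.0, 0.0, -1.0], top_k=2)
```

Lower temperature sharpens the distribution before filtering, so the same top_p keeps fewer tokens; this is why the two knobs are usually tuned together.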