MergeBench/gemma-2-9b-it_instruction

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:9BQuant:FP8Ctx Length:16kPublished:May 14, 2025License:mitArchitecture:Transformer Open Weights Warm

MergeBench/gemma-2-9b-it_instruction is a 9 billion parameter instruction-tuned language model, part of the Gemma 2 family, developed in conjunction with the MergeBench project. This model is specifically designed for evaluating the merging of domain-specialized Large Language Models, as detailed in its associated research paper. With a context length of 16384 tokens, it serves as a foundational component for research into model merging techniques and their performance implications.

Loading preview...

MergeBench/gemma-2-9b-it_instruction Overview

This model, MergeBench/gemma-2-9b-it_instruction, is a 9 billion parameter instruction-tuned language model. It is developed as part of the MergeBench project, an initiative focused on benchmarking the merging of domain-specialized Large Language Models. The model's architecture and training are geared towards facilitating research and evaluation in this specific area.

Key Capabilities

  • Instruction-tuned: Designed to follow instructions effectively, making it suitable for various NLP tasks.
  • Research-oriented: Primarily intended for use within the MergeBench framework to study model merging.
  • Context Length: Supports a substantial context window of 16384 tokens, allowing for processing longer inputs and maintaining conversational coherence.

Good For

  • Evaluating Model Merging: Its primary utility lies in serving as a base model for experiments related to merging different LLMs, as outlined in the associated research paper: MergeBench: A Benchmark for Merging Domain-Specialized LLMs.
  • Academic Research: Ideal for researchers and academics exploring advanced topics in LLM architecture, fine-tuning, and combination strategies.
  • Instruction Following Tasks: Can be used for general instruction-following applications, though its core purpose is research into model merging.