ibm-granite/granite-7b-instruct

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: May 19, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

ibm-granite/granite-7b-instruct is a 7-billion-parameter instruction-tuned causal language model developed by IBM Research. It is a derivative of the Granite-7b-base model, aligned using the novel Large-scale Alignment for chatBots (LAB) methodology with Mixtral-8x7B-Instruct as the teacher model. The model is designed to incrementally add new knowledge and skills without catastrophic forgetting, making it suitable for diverse conversational AI applications.


ibm-granite/granite-7b-instruct: An IBM Research Model

Granite-7b-instruct is a 7-billion-parameter instruction-tuned model developed by IBM Research, built on the Granite-7b-base architecture. It is aligned with the novel Large-scale Alignment for chatBots (LAB) methodology, using Mixtral-8x7B-Instruct as its teacher model.

Key Capabilities & Methodology

The LAB method is designed to enhance LLMs through a synthetic data-based alignment tuning process, comprising three core components:

  • Taxonomy-driven data curation: This approach uses a tree of seed examples to guide the sampling process during synthetic data generation, ensuring diverse knowledge domains and skills are covered.
  • Large-scale synthetic data generation: Unlike uniform sampling, LAB samples locally within leaf nodes of the taxonomy, allowing the teacher model to better exploit task distributions.
  • Two-phased training with replay buffers: This training strategy enables the model to incrementally acquire new knowledge and skills without suffering from catastrophic forgetting.
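The taxonomy-driven, leaf-local sampling in the first two components can be illustrated with a small sketch. The taxonomy below and the sampler are illustrative assumptions for exposition, not IBM's actual LAB implementation:

```python
# Hypothetical sketch of taxonomy-driven seed sampling: sampling happens
# locally within each leaf node, so every domain and skill contributes
# prompts regardless of how many seeds it holds. Illustrative only.
import random

# A toy taxonomy: each leaf node holds seed examples for one skill/domain.
taxonomy = {
    "knowledge": {
        "finance": ["What is compound interest?", "Define a bond."],
        "science": ["Explain photosynthesis.", "What is entropy?"],
    },
    "skills": {
        "reasoning": ["If A > B and B > C, compare A and C."],
        "writing": ["Write a haiku about rain."],
    },
}

def leaf_nodes(tree, path=()):
    """Yield (path, seed_examples) for every leaf of the taxonomy."""
    for name, child in tree.items():
        if isinstance(child, dict):
            yield from leaf_nodes(child, path + (name,))
        else:
            yield path + (name,), child

def sample_prompts(tree, per_leaf=2, seed=0):
    """Sample locally within each leaf (rather than uniformly over the
    pooled seed set), so all domains/skills are covered."""
    rng = random.Random(seed)
    prompts = []
    for path, seeds in leaf_nodes(tree):
        for example in rng.choices(seeds, k=per_leaf):
            prompts.append(("/".join(path), example))
    return prompts

prompts = sample_prompts(taxonomy)
# Every leaf contributes exactly `per_leaf` prompts, so small leaves
# (e.g. "reasoning" with one seed) are not drowned out by larger ones.
```

The sampled `(path, seed)` pairs would then be handed to the teacher model to generate synthetic instruction data for that leaf's domain.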

This methodology allows Granite-7b-instruct to incorporate new domain-specific knowledge and foundational skills like reasoning, as well as compositional skills such as creative writing.
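The replay-buffer idea behind the two-phased training can be sketched as follows. The phase names, mixing ratio, and batch layout are assumptions for illustration, not the exact LAB recipe:

```python
# Illustrative sketch of replay-buffer batch construction: a fraction of
# previously-seen examples is mixed into each new batch, so earlier
# knowledge keeps being rehearsed while new skills are learned.
import random

def make_batches(new_data, replay_buffer, replay_ratio=0.25, batch_size=8,
                 seed=0):
    """Build training batches that mix new examples with replayed ones."""
    rng = random.Random(seed)
    n_replay = int(batch_size * replay_ratio)
    n_new = batch_size - n_replay
    batches = []
    for i in range(0, len(new_data), n_new):
        batch = list(new_data[i:i + n_new])
        if replay_buffer:
            batch += rng.sample(replay_buffer,
                                min(n_replay, len(replay_buffer)))
        batches.append(batch)
    return batches

# Phase 1 tunes on knowledge data; phase 2 tunes on skills data while
# replaying phase-1 examples to mitigate catastrophic forgetting.
phase1 = [f"knowledge-{i}" for i in range(16)]
phase2 = [f"skill-{i}" for i in range(16)]
buffer = list(phase1)  # phase-1 examples become replay candidates
batches = make_batches(phase2, buffer)
```

Each full batch here holds six new phase-2 examples plus two replayed phase-1 examples; a real training run would tune the ratio and buffer contents per phase.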

Performance & Limitations

The model demonstrates competitive performance against other 7B and 13B models, achieving an MTBench score of 6.69 and an MMLU score of 51.91. However, Granite-7b-instruct has not undergone explicit safety alignment: it may produce problematic outputs and is susceptible to hallucination, particularly in ungrounded generation scenarios, so users should implement appropriate safeguards.