ibm-granite/granite-3.1-8b-instruct

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Dec 6, 2024License:apache-2.0Architecture:Transformer0.2K Open Weights Cold

Granite-3.1-8B-Instruct is an 8 billion parameter instruction-tuned causal language model developed by IBM. Fine-tuned from Granite-3.1-8B-Base, it features a 32768 token context length and is optimized for long-context tasks, including summarization, question-answering, and Retrieval Augmented Generation (RAG). The model supports multilingual dialog in 12 languages and is designed for building AI assistants in various domains.

Loading preview...

Overview

Granite-3.1-8B-Instruct is an 8 billion parameter instruction-tuned model developed by IBM, building upon the Granite-3.1-8B-Base architecture. It incorporates supervised finetuning, reinforcement learning for alignment, and model merging techniques. Designed with a structured chat format, this model excels in handling long-context problems, making it suitable for complex business applications.

Key Capabilities

  • Long-Context Processing: Optimized for tasks requiring extensive context, such as long document summarization and question-answering over large texts.
  • Multilingual Support: Capable of multilingual dialog in 12 languages, including English, German, Spanish, French, and Japanese.
  • Diverse Task Performance: Proficient in summarization, text classification, text extraction, question-answering, Retrieval Augmented Generation (RAG), code-related tasks, and function-calling.
  • Robust Architecture: Based on a decoder-only dense transformer with GQA, RoPE, SwiGLU MLP, RMSNorm, and shared input/output embeddings.

Performance Highlights

On the HuggingFace Open LLM Leaderboard V1, Granite-3.1-8B-Instruct achieved an average score of 71.31, with notable scores of 65.34 on MMLU and 73.84 on GSM8K. For the V2 leaderboard, it scored an average of 30.55, including 72.08 on IFEval and 34.09 on BBH.

Intended Use Cases

This model is ideal for developers looking to build AI assistants that require strong instruction following, handle extensive textual inputs, and operate across multiple languages. It is particularly well-suited for enterprise solutions needing reliable performance in areas like document analysis and automated customer support.