Overview

Granite-3.1-8B-Instruct is an 8 billion parameter instruction-tuned model developed by IBM, building upon the Granite-3.1-8B-Base architecture. It incorporates supervised finetuning, reinforcement learning for alignment, and model merging techniques. Designed with a structured chat format, this model excels in handling long-context problems, making it suitable for complex business applications.

Key Capabilities

Long-Context Processing: Optimized for tasks requiring extensive context, such as long document summarization and question-answering over large texts.
Multilingual Support: Capable of multilingual dialog in 12 languages, including English, German, Spanish, French, and Japanese.
Diverse Task Performance: Proficient in summarization, text classification, text extraction, question-answering, Retrieval Augmented Generation (RAG), code-related tasks, and function-calling.
Robust Architecture: Based on a decoder-only dense transformer with GQA, RoPE, SwiGLU MLP, RMSNorm, and shared input/output embeddings.

Performance Highlights

On the HuggingFace Open LLM Leaderboard V1, Granite-3.1-8B-Instruct achieved an average score of 71.31, with notable scores of 65.34 on MMLU and 73.84 on GSM8K. For the V2 leaderboard, it scored an average of 30.55, including 72.08 on IFEval and 34.09 on BBH.

Intended Use Cases

This model is ideal for developers looking to build AI assistants that require strong instruction following, handle extensive textual inputs, and operate across multiple languages. It is particularly well-suited for enterprise solutions needing reliable performance in areas like document analysis and automated customer support.

Overview

Overview

Key Capabilities

Performance Highlights

Intended Use Cases

Full Model Card (README)