Overview

Granite-3.1-2B-Instruct is a 2 billion parameter instruction-tuned language model developed by the Granite Team at IBM. It is fine-tuned from the Granite-3.1-2B-Base model using a combination of open-source instruction datasets and internal synthetic datasets specifically designed for long-context problems. The model incorporates supervised finetuning, reinforcement learning for alignment, and model merging techniques, and is built on a decoder-only dense transformer architecture featuring GQA, RoPE, SwiGLU MLP, and RMSNorm.

Key Capabilities

Long-Context Processing: Optimized for tasks requiring extensive context, such as long document summarization and question-answering.
Multilingual Support: Capable of handling dialog in English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese.
General Instruction Following: Designed to respond to a wide range of instructions for building AI assistants.
Diverse NLP Tasks: Proficient in summarization, text classification, text extraction, question-answering, Retrieval Augmented Generation (RAG), and code-related tasks.

Intended Use Cases

This model is suitable for developers looking to build AI assistants for various business applications. Its strengths lie in:

Summarizing lengthy documents or meeting transcripts.
Extracting specific information from text.
Answering questions based on provided context.
Implementing RAG systems for enhanced information retrieval.
Handling multilingual dialog scenarios.
Performing code-related tasks and function-calling.

Overview

Overview

Key Capabilities

Intended Use Cases

Full Model Card (README)