ibm-granite/granite-3.1-2b-instruct
Granite-3.1-2B-Instruct is a 2 billion parameter instruction-tuned decoder-only transformer model developed by the Granite Team at IBM. Fine-tuned from Granite-3.1-2B-Base, it excels at long-context tasks, including summarization, question-answering, and RAG, with a context length of 32768 tokens. The model supports 12 languages and is designed for building AI assistants across various domains, leveraging techniques like supervised finetuning and reinforcement learning.
Loading preview...
Overview
Granite-3.1-2B-Instruct is a 2 billion parameter instruction-tuned language model developed by the Granite Team at IBM. It is fine-tuned from the Granite-3.1-2B-Base model using a combination of open-source instruction datasets and internal synthetic datasets specifically designed for long-context problems. The model incorporates supervised finetuning, reinforcement learning for alignment, and model merging techniques, and is built on a decoder-only dense transformer architecture featuring GQA, RoPE, SwiGLU MLP, and RMSNorm.
Key Capabilities
- Long-Context Processing: Optimized for tasks requiring extensive context, such as long document summarization and question-answering.
- Multilingual Support: Capable of handling dialog in English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese.
- General Instruction Following: Designed to respond to a wide range of instructions for building AI assistants.
- Diverse NLP Tasks: Proficient in summarization, text classification, text extraction, question-answering, Retrieval Augmented Generation (RAG), and code-related tasks.
Intended Use Cases
This model is suitable for developers looking to build AI assistants for various business applications. Its strengths lie in:
- Summarizing lengthy documents or meeting transcripts.
- Extracting specific information from text.
- Answering questions based on provided context.
- Implementing RAG systems for enhanced information retrieval.
- Handling multilingual dialog scenarios.
- Performing code-related tasks and function-calling.