Granite-3.3-8B-Instruct: Enhanced Reasoning and Instruction Following

Granite-3.3-8B-Instruct is an 8-billion-parameter language model from IBM, designed for improved reasoning and instruction following. It builds on Granite-3.3-8B-Base, supports a 128K context length, and shows performance gains on general benchmarks including AlpacaEval-2.0 and Arena-Hard, as well as on mathematics and coding tasks.

Key Capabilities

Structured Reasoning: Uses <think> and <response> tags to separate the model's internal reasoning from its final answer, so callers can show, log, or discard each part independently.
Multilingual Support: Supports English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, and Chinese, with potential for fine-tuning in additional languages.
Broad Task Proficiency: Excels in summarization, text classification, extraction, question-answering, RAG, code-related tasks, function-calling, and long-context applications.
Mathematical Prowess: Scores 8.12 on AIME24 and 69.02 on MATH-500, two mathematical reasoning benchmarks.
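
The Structured Reasoning capability above implies that client code needs to split a generation into its reasoning and its answer. The following is a minimal sketch of that step, not part of the model card: the function name split_reasoning and the ParsedOutput type are hypothetical, and the closing tags </think> and </response> are an assumption, since the text above names only the opening tags.

```python
import re
from typing import NamedTuple, Optional


class ParsedOutput(NamedTuple):
    thought: Optional[str]  # text inside <think>...</think>, if present
    response: str           # text inside <response>...</response>, or a fallback


# Assumption: each section is closed with </think> and </response>.
_THINK = re.compile(r"<think>(.*?)</think>", re.DOTALL)
_RESPONSE = re.compile(r"<response>(.*?)</response>", re.DOTALL)


def split_reasoning(text: str) -> ParsedOutput:
    """Separate the reasoning section from the final answer in raw model output."""
    think = _THINK.search(text)
    resp = _RESPONSE.search(text)
    thought = think.group(1).strip() if think else None
    if resp:
        response = resp.group(1).strip()
    else:
        # No <response> tag: treat everything outside <think> as the answer.
        response = _THINK.sub("", text).strip()
    return ParsedOutput(thought, response)


if __name__ == "__main__":
    sample = "<think>2 + 2 is 4.</think><response>The answer is 4.</response>"
    parsed = split_reasoning(sample)
    print(parsed.thought)
    print(parsed.response)
```

Falling back to the untagged text keeps the helper usable when the model answers without its reasoning tags, at the cost of not detecting malformed output.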

Good For

General Instruction Following: Handles a wide range of instruction-based tasks.
AI Assistants: Suitable for integration into AI assistants across diverse domains, including business applications.
Complex Reasoning: The separated <think> and <response> output suits tasks that benefit from explicit intermediate reasoning.
Long Document Processing: The 128K context length supports long-document summarization and QA.
