krutrim-ai-labs/Krutrim-2-instruct

Text generation · Concurrency cost: 1 · Model size: 12B · Quantization: FP8 · Context length: 32k · Published: Feb 2, 2025 · License: Krutrim Community License Agreement v1.0 · Architecture: Transformer

Krutrim-2-instruct is a 12 billion parameter instruction-tuned language model developed by the OLA Krutrim team, built on the Mistral-NeMo 12B architecture. It supports a 128K token context window and is trained across diverse domains including web data, code, math, and Indic languages. The model delivers best-in-class performance on Indic tasks and competitive results on English benchmarks, with particular strength in multilingual generation and Indian cultural contexts.


Model Overview

Krutrim-2-instruct is a 12 billion parameter instruction-tuned language model from OLA Krutrim, based on the Mistral-NeMo 12B architecture. It was pretrained on a wide array of data, including web content, code, math, Indic languages, and Indian context data, then fine-tuned for instruction following across various tasks like knowledge recall, math, reasoning, coding, safety, and creative writing. The model further underwent Direct Preference Optimization (DPO) to enhance helpfulness, safety, and reasoning alignment.
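A minimal usage sketch for an instruction-tuned model like this, via the Hugging Face `transformers` library, is shown below. The model ID comes from this page; everything else is an assumption: Mistral-NeMo-based models commonly use the `[INST] ... [/INST]` chat convention, but the authoritative template ships with the tokenizer, so `tokenizer.apply_chat_template` should be preferred in practice.

```python
def build_instruct_prompt(user_message: str, system_prompt: str = "") -> str:
    """Format a single-turn prompt in the Mistral [INST] style.

    This template is an assumption based on the Mistral-NeMo lineage;
    prefer the tokenizer's own chat template when available.
    """
    content = f"{system_prompt}\n\n{user_message}" if system_prompt else user_message
    return f"<s>[INST] {content} [/INST]"


def generate_reply(user_message: str, max_new_tokens: int = 128) -> str:
    """Hypothetical end-to-end generation call (downloads ~12B weights)."""
    # Heavy imports kept inside the function so the prompt helper above
    # can be used without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "krutrim-ai-labs/Krutrim-2-instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Use the tokenizer's bundled chat template rather than a hand-rolled one.
    messages = [{"role": "user", "content": user_message}]
    inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

For example, `generate_reply("Translate 'good morning' to Hindi.")` would return the model's Hindi translation, subject to sampling settings and hardware availability.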

Key Capabilities

  • 12B parameter dense transformer model offering improved generalization.
  • 128K token context window suitable for long conversations and document processing.
  • Natively multilingual, providing best-in-class performance on Indic benchmarks.
  • Strong Indian cultural context relevance, outperforming many models in manual evaluations.
  • Competitive performance on English benchmarks and HumanEval coding tasks; for multilingual Indic generation it often matches or exceeds models 5-10x its size.
  • Achieves top-3 performance on 5 out of 7 tasks in BharatBench, a natively Indic benchmark.

Use Cases

This model is particularly well-suited for applications requiring strong performance in:

  • Multilingual generation, especially in Indic languages.
  • Long-form content generation and multi-turn conversations due to its large context window.
  • Tasks requiring Indian cultural context and linguistic nuances.
  • Coding assistance and general instruction following.