sarvamai/sarvam-m

24B parameters · FP8 · 32768-token context · License: apache-2.0 · Available on Hugging Face

Sarvam-M: A Hybrid-Reasoning Multilingual LLM

Sarvam-M is a 24-billion-parameter language model from Sarvam AI, built on the Mistral-Small architecture and post-trained to excel in multilingual contexts, particularly Indian languages, while retaining strong hybrid reasoning capabilities. The model supports a 32768-token context length.
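
As a quick orientation, here is a minimal sketch of loading the model through the standard Hugging Face transformers interface. The device placement and generation settings are illustrative assumptions, not Sarvam's recommended configuration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sarvamai/sarvam-m"

# Standard Hugging Face loading path; device_map="auto" requires the
# accelerate package and spreads layers across available devices.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [{"role": "user", "content": "Explain the water cycle in Hindi."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```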

Key Capabilities

  • Hybrid Thinking Mode: Features distinct "think" and "non-think" modes. The "think" mode is optimized for complex logical reasoning, mathematical problems, and coding tasks, while the "non-think" mode handles efficient, general-purpose conversation (see the sketch after this list for toggling between the two).
  • Advanced Indic Skills: Achieves a +20% average improvement over its base model on Indian language benchmarks, with full support for both native Indic scripts and romanized Indian-language text, and is tuned with Indian cultural context in mind.
  • Superior Reasoning: Posts significant gains over the base model, including +21.6% on math benchmarks and +17.6% on programming benchmarks. Notably, it shows an +86% improvement on romanized Indian-language GSM-8K.
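
To make the mode switch concrete, the sketch below toggles between the two at prompt-construction time. The enable_thinking flag and the <think> tag convention are assumptions drawn from how comparable hybrid-reasoning models expose this switch; check Sarvam-M's chat template for the exact mechanism:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sarvamai/sarvam-m")
messages = [{"role": "user", "content": "What is 17 * 23? Show your work."}]

# "think" mode: the model may emit intermediate reasoning
# (typically wrapped in <think>...</think>) before the final answer.
reasoning_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # assumed flag; verify against the chat template
)

# "non-think" mode: direct, lower-latency conversational replies.
chat_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
```

In either case the generation call itself is unchanged; only the rendered prompt differs, so switching modes costs nothing at inference time beyond the extra reasoning tokens "think" mode produces.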

Good For

  • Applications requiring robust multilingual support, especially for Indian languages.
  • Tasks demanding complex logical reasoning, such as mathematical problem-solving and code generation.
  • Developing conversational AI that can seamlessly switch between analytical and general interaction modes.