Overview
Sarvam-M: A Hybrid-Reasoning Multilingual LLM
Sarvam-M is a 24-billion-parameter language model developed by SarvamAI, built upon the Mistral-Small architecture. It is post-trained specifically to excel in multilingual contexts, particularly Indian languages, while also demonstrating strong hybrid reasoning capabilities. The model supports a 32,768-token context length.
Key Capabilities
- Hybrid Thinking Mode: Features distinct "think" and "non-think" modes. The "think" mode is optimized for complex logical reasoning, mathematical problems, and coding tasks, while the "non-think" mode handles efficient, general-purpose conversations.
- Advanced Indic Skills: Achieves a +20% average improvement on Indian language benchmarks, supports both native Indic scripts and romanized forms of Indian languages, and is post-trained to reflect Indian cultural context.
- Superior Reasoning: Demonstrates significant performance gains, including a +21.6% improvement on math benchmarks and a +17.6% improvement on programming benchmarks. Notably, it shows a +86% improvement on romanized Indian language GSM-8K benchmarks.
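In practice, a hybrid-reasoning model's "think"-mode completions typically interleave a reasoning trace with the final answer. The sketch below shows one way to separate the two, assuming the model wraps its chain of thought in `<think>...</think>` tags; that tag convention is an assumption for illustration, not behavior documented above.

```python
import re

def split_hybrid_response(text: str) -> tuple[str, str]:
    """Split a hybrid-mode completion into (reasoning, answer).

    Assumes reasoning is wrapped in <think>...</think> tags when
    "think" mode is enabled; this tag convention is an assumption,
    not documented behavior from the overview above.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # Non-think mode: the whole completion is the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer
```

With this convention, `split_hybrid_response("<think>2+2=4</think>The answer is 4.")` yields the reasoning trace and the answer separately, while a non-think completion passes through unchanged.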
Good For
- Applications requiring robust multilingual support, especially for Indian languages.
- Tasks demanding complex logical reasoning, such as mathematical problem-solving and code generation.
- Developing conversational AI that can seamlessly switch between analytical and general interaction modes.