# GanitLLM-0.6B_SFT_GRPO: Bengali Mathematical Reasoning Model
GanitLLM-0.6B_SFT_GRPO is a 0.6-billion-parameter causal language model developed by dipta007 and optimized for Bengali mathematical reasoning. Built on the Qwen3-0.6B base, it was trained with a two-stage pipeline: Supervised Fine-Tuning (SFT) on the GANIT-SFT dataset, followed by standard Group Relative Policy Optimization (GRPO) with random sampling on GANIT-RLVR. Training uses dedicated reward functions for format validation, answer correctness (accepting both Bengali and English answers), and maintaining a high percentage of Bengali text in the reasoning. Designed for resource-constrained environments, the model offers a 4,096-token context length and supports both Bengali and English.
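The exact reward implementations are not published here, but the Bengali-text reward can be sketched from the description above. The function names, the 85% target ratio, and the linear scaling below the target are illustrative assumptions, not the model's actual training code:

```python
# Hypothetical sketch of a Bengali-percentage reward for GRPO.
# Characters in the Bengali Unicode block (U+0980-U+09FF) are counted
# against all alphabetic characters; the target and scaling are assumed.

def bengali_ratio(text: str) -> float:
    """Fraction of alphabetic characters that are Bengali letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    bengali = sum(1 for ch in letters if "\u0980" <= ch <= "\u09ff")
    return bengali / len(letters)

def language_reward(reasoning: str, target: float = 0.85) -> float:
    """Full reward when reasoning is predominantly Bengali;
    scaled linearly below the target ratio (an assumed scheme)."""
    ratio = bengali_ratio(reasoning)
    return 1.0 if ratio >= target else ratio / target
```

A reward shaped this way penalizes the base model's tendency to reason in English while still tolerating occasional English tokens (e.g. numerals or equation symbols interleaved with Bengali text).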
## Key Capabilities & Performance
- Enhanced Bengali Mathematical Reasoning: Achieves a +24.0-point accuracy gain on Bn-MGSM (from 8.4 to 32.4) and a +40.3-point gain on Bn-MSVAMP (from 12.2 to 52.5) over the base Qwen3-0.6B model.
- High Bengali Reasoning Percentage: Produces 88.45% of its reasoning in Bengali, a substantial increase from the base model's 12.43%.
- Concise Solutions: Generates solutions roughly 80.6% shorter than the base model's (about 246 words on average vs. 1,265).
- Resource-Efficient: Designed for deployment in resource-constrained environments.
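The conciseness figure above follows directly from the two reported averages; a quick check:

```python
# Verify the reported length reduction from the average solution lengths.
base_len, tuned_len = 1265, 246  # average words per solution (base vs. fine-tuned)
reduction = 1 - tuned_len / base_len
print(f"{reduction:.1%}")  # → 80.6%
```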
## Ideal Use Cases
This model is well-suited for applications that need accurate, efficient mathematical problem-solving in Bengali, especially where computational resources are limited. Its concise, Bengali-focused reasoning makes it valuable for educational tools, localized AI assistants, and research in multilingual NLP for mathematical tasks.