dipta007/GanitLLM-0.6B_SFT_CGRPO
Text generation · Concurrency cost: 1 · Model size: 0.8B · Quantization: BF16 · Context length: 32k · Published: Jan 1, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

GanitLLM-0.6B_SFT_CGRPO by dipta007 is a 0.6 billion parameter causal language model based on Qwen3-0.6B, specifically fine-tuned for Bengali mathematical reasoning. It utilizes Supervised Fine-Tuning (SFT) and a novel Curriculum-GRPO reinforcement learning approach. This model significantly improves accuracy on Bengali mathematical benchmarks while generating more concise and Bengali-centric solutions, making it suitable for resource-constrained deployments.


GanitLLM-0.6B_SFT_CGRPO: Bengali Mathematical Reasoning Model

GanitLLM-0.6B_SFT_CGRPO is a 0.6 billion parameter causal language model developed by dipta007, built upon the Qwen3-0.6B architecture. It is specifically designed for mathematical reasoning in Bengali, making it ideal for applications requiring strong performance in this domain, especially in resource-constrained environments.
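A minimal inference sketch using Hugging Face `transformers` with the model ID from this card. The example prompt (a simple Bengali word problem) and the generation settings are illustrative assumptions, not part of the card:

```python
# Minimal inference sketch (assumes `transformers` and `torch` are installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dipta007/GanitLLM-0.6B_SFT_CGRPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

# Illustrative Bengali math word problem:
# "Rahim has 5 mangoes. He buys 3 more. How many mangoes does he have in total?"
messages = [{"role": "user", "content": "রহিমের কাছে ৫টি আম আছে। সে আরও ৩টি আম কিনল। তার মোট কতটি আম হলো?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

With a 0.6B-parameter model in BF16, this fits comfortably on a single consumer GPU or CPU, consistent with the resource-constrained deployments the card targets.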

Key Capabilities and Differentiators

  • Enhanced Bengali Mathematical Reasoning: Delivers large accuracy gains on Bengali mathematical benchmarks over its base model: +20.0 points on Bn-MGSM (8.4 → 28.4) and +40.2 points on Bn-MSVAMP (12.2 → 52.4).
  • Efficient and Bengali-Centric Solutions: Produces reasoning that is 88.50% Bengali text and uses 80.4% fewer tokens than the base model (averaging 248 words per solution vs. 1,265), yielding more concise, on-language responses.
  • Advanced Training Methodology: Trained with a multi-stage pipeline: Supervised Fine-Tuning (SFT) on ~11k examples, followed by a novel Curriculum-GRPO stage (reinforcement learning with difficulty-aware sampling) on ~7.3k examples. The reward design covers format validation, correctness (answer match in Bengali and English), and a high share of Bengali text in the reasoning.
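The reward components and the difficulty-aware sampling described above can be sketched as follows. This is a hedged illustration, not the author's code: the `<think>`/`<answer>` template, function names, language threshold, and bucket weights are all assumptions; only the three reward categories and the curriculum idea come from the card.

```python
# Sketch of Curriculum-GRPO ingredients: three reward functions
# (format, correctness, Bengali-text share) and a difficulty-aware sampler.
import random
import re

BENGALI_CHAR = re.compile(r"[\u0980-\u09FF]")             # Bengali Unicode block
BN_TO_ASCII = str.maketrans("০১২৩৪৫৬৭৮৯", "0123456789")   # Bengali digits -> ASCII
TEMPLATE = re.compile(r"<think>.*</think>\s*<answer>.*</answer>", re.DOTALL)

def format_reward(completion: str) -> float:
    """1.0 if the completion matches the assumed reasoning/answer template."""
    return 1.0 if TEMPLATE.search(completion) else 0.0

def correctness_reward(predicted: str, gold: str) -> float:
    """1.0 if answers match after normalizing Bengali numerals to ASCII,
    so a correct answer counts whether written in Bengali or English digits."""
    norm = lambda s: s.strip().translate(BN_TO_ASCII)
    return 1.0 if norm(predicted) == norm(gold) else 0.0

def bengali_ratio(text: str) -> float:
    """Share of alphabetic characters drawn from the Bengali script."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(bool(BENGALI_CHAR.match(c)) for c in letters) / len(letters)

def language_reward(reasoning: str, threshold: float = 0.8) -> float:
    """1.0 when the reasoning is predominantly Bengali (threshold assumed)."""
    return 1.0 if bengali_ratio(reasoning) >= threshold else 0.0

def curriculum_sample(problems: dict, weights: dict, rng=random):
    """Difficulty-aware sampling: pick a difficulty bucket by weight, then a
    problem from that bucket. A curriculum shifts `weights` toward harder
    buckets as training progresses."""
    buckets = list(problems)
    bucket = rng.choices(buckets, weights=[weights[b] for b in buckets])[0]
    return rng.choice(problems[bucket])
```

The numeral normalization in `correctness_reward` is one plausible way to realize the "Bengali and English answer match" criterion; real answer checking would also need to handle whitespace, units, and fractional forms.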

Use Cases

This model is particularly well-suited for:

  • Bengali mathematical problem-solving applications.
  • Educational tools requiring step-by-step reasoning in Bengali.
  • Deployments where computational resources are limited, due to its small parameter count.

For more details, refer to the project page and the associated arXiv paper.