GanitLLM-1.7B_CGRPO by dipta007 is a 1.7 billion parameter causal language model based on Qwen3-1.7B, fine-tuned using Curriculum-GRPO for Bengali mathematical reasoning. It achieves significant accuracy improvements on Bengali mathematical benchmarks (Bn-MGSM and Bn-MSVAMP) by reasoning primarily in English, while generating more concise solutions. This model is optimized for high-accuracy mathematical problem-solving in a Bengali context, despite its English reasoning output.
GanitLLM-1.7B_CGRPO: Bengali Mathematical Reasoning
GanitLLM-1.7B_CGRPO is a 1.7 billion parameter causal language model developed by dipta007, built upon the Qwen3-1.7B base model. It is trained using Curriculum-GRPO (curriculum-guided Group Relative Policy Optimization), applied directly to the base model without an initial Supervised Fine-Tuning (SFT) stage.
Key Capabilities & Performance
This model excels in Bengali mathematical reasoning tasks, demonstrating substantial performance gains:
- +44.4 points of accuracy on the Bn-MGSM benchmark, increasing from 15.2 to 59.6.
- +52.1 points of accuracy on the Bn-MSVAMP benchmark, improving from 14.1 to 66.2.
- Generates solutions that are 10.9% shorter on average (1002 tokens vs. 1124 for the base model), indicating more concise reasoning.
Important Note on Reasoning Language
While designed for Bengali mathematical problems, this specific variant of GanitLLM-1.7B_CGRPO primarily performs its reasoning steps in English. Its share of Bengali text in reasoning (18.74%) is similar to that of the base model. For models that reason in Bengali, users are directed to the GanitLLM-1.7B_SFT_CGRPO variant.
Training Methodology
The model was trained using a single-stage Curriculum-GRPO pipeline on the GANIT-RLVR dataset (~7.3k examples). Reward functions during training included:
- Format Reward: Validating the structure of the <think> and <answer> tags.
- Correctness Reward: Awarding +2.0 for Bengali answer matches and +1.0 for English matches.
- Bengali Reasoning Reward: Encouraging >80% Bengali text in reasoning (though this model's output is predominantly English).
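The reward terms above can be sketched as simple scoring functions. The exact implementation and reward weights used in training are not published here, so the tag regex, the 0.5 format bonus, and the Bengali Unicode-range heuristic below are illustrative assumptions; only the +2.0/+1.0 correctness values and the 80% threshold come from this card.

```python
import re

# Illustrative tag grammar: a single <think> block followed by a single <answer> block.
THINK_ANSWER = re.compile(r"^<think>.*</think>\s*<answer>.*</answer>\s*$", re.DOTALL)

def format_reward(completion: str) -> float:
    # Hypothetical +0.5 bonus when the completion follows the tag layout.
    return 0.5 if THINK_ANSWER.match(completion.strip()) else 0.0

def correctness_reward(predicted: str, gold_bn: str, gold_en: str) -> float:
    # +2.0 for a Bengali answer match, +1.0 for an English match (values from the card).
    p = predicted.strip()
    if p == gold_bn.strip():
        return 2.0
    if p == gold_en.strip():
        return 1.0
    return 0.0

def bengali_ratio(text: str) -> float:
    # Fraction of alphabetic characters that fall in the Bengali block (U+0980-U+09FF).
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(1 for c in letters if "\u0980" <= c <= "\u09ff") / len(letters)

def bengali_reasoning_reward(reasoning: str, threshold: float = 0.8) -> float:
    # Illustrative +1.0 when more than 80% of the reasoning text is Bengali.
    return 1.0 if bengali_ratio(reasoning) > threshold else 0.0
```

In a GRPO-style setup these terms would be summed into a single scalar reward per sampled completion; the weighting between them is an open detail of the actual training pipeline.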
Use Cases
This model is ideal for applications requiring high-accuracy mathematical problem-solving in a Bengali context, where the intermediate reasoning steps can be in English. It offers a compact and efficient solution for improving mathematical benchmark scores.