dipta007/GanitLLM-4B_SFT_CGRPO
Text generation · Concurrency cost: 1 · Model size: 4B · Quant: BF16 · Context length: 32k · Published: Jan 1, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

GanitLLM-4B_SFT_CGRPO by dipta007 is a 4 billion parameter causal language model based on Qwen3-4B, specifically fine-tuned for mathematical reasoning in Bengali. It utilizes Supervised Fine-Tuning (SFT) and a novel Curriculum-GRPO approach to significantly improve accuracy on Bengali mathematical benchmarks. The model demonstrates enhanced Bengali reasoning capabilities and generates substantially more concise solutions compared to its base model, making it ideal for complex mathematical problem-solving in Bengali.
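Model cards typically include a usage snippet; the page above does not show one, so the following is a minimal sketch under two assumptions: that the weights are published on the Hugging Face Hub under the ID `dipta007/GanitLLM-4B_SFT_CGRPO`, and that, as a Qwen3-4B derivative, the checkpoint loads with the standard `transformers` `AutoModelForCausalLM` API and chat template.

```python
# Minimal usage sketch (assumptions: Hub ID below is correct, and the
# checkpoint loads like its Qwen3-4B base via standard transformers APIs).
MODEL_ID = "dipta007/GanitLLM-4B_SFT_CGRPO"


def build_messages(problem_bn: str) -> list[dict]:
    """Wrap a Bengali math problem as a single-turn chat message."""
    return [{"role": "user", "content": problem_bn}]


def main() -> None:
    # Heavy imports and the model download stay inside main() so the
    # module can be imported without pulling the 4B checkpoint.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the card's BF16 quant field
        device_map="auto",
    )

    # Example Bengali math prompt ("72 ÷ 8 = ?").
    messages = build_messages("৭২ ÷ ৮ = কত?")
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))


# Call main() to download the weights and generate a solution.
```

The `device_map="auto"` and `torch_dtype` arguments are ordinary `from_pretrained` options, not anything specific to this model; adjust them to your hardware.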