XueZhang-bjtu/M-Thinker-1.5B-Iter2
Text Generation · Concurrency cost: 1 · Model size: 1.5B · Quant: BF16 · Context length: 32k · Published: Oct 14, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

M-Thinker-1.5B-Iter2 is a 1.5-billion-parameter large reasoning model developed by Xue Zhang and colleagues to strengthen multilingual reasoning. It addresses two common weaknesses of reasoning models on non-English inputs: poor input-output language consistency (answering in a language other than the prompt's) and lower-quality reasoning paths. The model is trained with a GRPO variant that incorporates two additional rewards: a Language Consistency (LC) reward, which encourages the reasoning and answer to stay in the prompt's language, and a Cross-lingual Thinking Alignment (CTA) reward, which aligns the quality of non-English reasoning paths with English ones. It achieves high language consistency and strong results on multilingual benchmarks such as MMATH and PolyMath.
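A minimal usage sketch with the Hugging Face `transformers` library is shown below. The repo ID comes from this card; the chat template usage and generation settings are assumptions rather than documented defaults for this model, so adjust them to the model's actual configuration.

```python
# Minimal sketch: loading M-Thinker-1.5B-Iter2 with transformers.
# Assumes the repo ships a chat template; generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XueZhang-bjtu/M-Thinker-1.5B-Iter2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # card lists BF16 weights
    device_map="auto",           # requires the accelerate package
)

# A non-English math prompt, to exercise language consistency:
# the model should reason and answer in German here.
messages = [{"role": "user",
             "content": "Was ist die Summe der ersten 100 natürlichen Zahlen?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

To illustrate the Language Consistency idea, here is a toy reward in the spirit described above. The actual LC reward used to train M-Thinker is defined in the accompanying paper; this detector-based version is purely an assumption for illustration.

```python
# Illustrative only: a toy Language Consistency (LC) reward that scores 1.0
# when the response is detected to be in the same language as the prompt.
from langdetect import detect  # pip install langdetect

def lc_reward(prompt: str, response: str) -> float:
    """Return 1.0 if response language matches prompt language, else 0.0."""
    try:
        return 1.0 if detect(response) == detect(prompt) else 0.0
    except Exception:
        # Detection can fail on very short or empty strings.
        return 0.0
```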
