jordanpainter/dialect-qwen-gspo-all
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Apr 2, 2026 · Architecture: Transformer · Status: Cold

jordanpainter/dialect-qwen-gspo-all is an 8-billion-parameter language model, fine-tuned from jordanpainter/DialLM-Qwen-sft-all using the TRL framework. It was trained with the GRPO method introduced in the DeepSeekMath paper, which is designed to enhance mathematical reasoning capabilities. The model is optimized for tasks that require structured, multi-step problem-solving, making it suitable for applications demanding robust logical inference.
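A minimal sketch of querying the model with the Hugging Face `transformers` library, assuming the checkpoint is published on the Hub under the same identifier and follows the usual Qwen-style chat format. The system prompt, question, and generation parameters are illustrative, not part of the model card.

```python
# Hedged example: local inference with `transformers`.
# Assumptions: the Hub id below resolves to this checkpoint, and the
# tokenizer ships a chat template (standard for Qwen-derived models).

def build_messages(question: str) -> list[dict]:
    """Wrap a reasoning question in the chat-message format the model expects."""
    return [
        {"role": "system", "content": "You are a careful step-by-step reasoner."},
        {"role": "user", "content": question},
    ]

if __name__ == "__main__":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "jordanpainter/dialect-qwen-gspo-all"  # assumed Hub identifier
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = tokenizer.apply_chat_template(
        build_messages("If 3x + 5 = 20, what is x?"),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # The 32k context window leaves ample room for long reasoning chains;
    # still, keep generation bounded.
    outputs = model.generate(**inputs, max_new_tokens=512)
    print(tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ))
```

With the FP8 quantization noted above, the 8B weights fit comfortably on a single modern GPU; `device_map="auto"` lets `transformers` place them automatically.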
