jordanpainter/dialect-qwen-gspo-brit
Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 32k · Published: Apr 3, 2026 · Architecture: Transformer
jordanpainter/dialect-qwen-gspo-brit is an 8-billion-parameter language model fine-tuned from jordanpainter/diallm-qwen-sft-brit. It was trained with GRPO, the reinforcement-learning method introduced in the DeepSeekMath paper to improve mathematical reasoning, and is intended for tasks that benefit from stronger reasoning.
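For context, GRPO drops the learned value baseline of PPO and instead normalizes each sampled completion's reward against its sampling group. The sketch below shows only that group-relative advantage step, assuming the standard formulation from the DeepSeekMath paper; the function name and the example rewards are illustrative, not part of this model's training code.

```python
def group_relative_advantages(rewards, eps=1e-8):
    """Group-relative advantage (GRPO-style): normalize each reward
    against the mean and standard deviation of its sampling group."""
    g = len(rewards)
    mean = sum(rewards) / g
    std = (sum((r - mean) ** 2 for r in rewards) / g) ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Example: scores from a reward model for 4 completions of one prompt.
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Because advantages are centered within each group, completions are rewarded relative to their siblings for the same prompt, which is what removes the need for a separate value network.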