Thrillcrazyer/Qwen-7B_NOTAC_GSPO
Text generation · Model size: 7.6B · Quant: FP8 · Context length: 32k · Published: Jan 6, 2026 · Architecture: Transformer
Thrillcrazyer/Qwen-7B_NOTAC_GSPO is a 7.6 billion parameter language model fine-tuned from Qwen/Qwen2.5-7B-Instruct. It specializes in mathematical reasoning, having been trained on the DeepMath-103k dataset using the GRPO method. This model is optimized for complex mathematical problem-solving and related reasoning tasks, offering a context length of 131072 tokens.
Overview
Thrillcrazyer/Qwen-7B_NOTAC_GSPO is a 7.6 billion parameter language model derived from the Qwen2.5-7B-Instruct architecture. Its primary distinction lies in its specialized training for mathematical reasoning, achieved through fine-tuning on the DeepMath-103k dataset.
Key Capabilities
- Enhanced Mathematical Reasoning: The model was trained using the GRPO (Group Relative Policy Optimization) method, introduced in the DeepSeekMath paper, specifically to improve its ability to handle complex mathematical problems.
- Instruction Following: As a fine-tuned version of an instruction-tuned model, it retains strong instruction-following capabilities.
- Large Context Window: Supports a substantial context length of 131072 tokens, beneficial for multi-step reasoning and complex problem descriptions.
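The group-relative reward normalization at the core of GRPO can be sketched in a few lines. This is a simplified illustration of the advantage computation described in the DeepSeekMath paper, not this model's actual training code: for each prompt, a group of candidate responses is sampled, each is scored with a reward, and advantages are obtained by standardizing rewards within the group.

```python
def group_relative_advantages(rewards):
    """Standardize a group of rewards: (r - group mean) / group std.

    In GRPO, each prompt gets a group of sampled responses; the
    normalized reward serves as the per-response advantage, so no
    separate value network is needed.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5 or 1.0  # guard against zero std for uniform groups
    return [(r - mean) / std for r in rewards]

# Example: 4 sampled answers to one math problem, reward 1.0 if the
# final answer is correct, 0.0 otherwise.
advs = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct answers in the group receive a positive advantage and incorrect ones a negative advantage, which is what pushes the policy toward responses that solve the problem.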
Good For
- Mathematical Problem Solving: Ideal for applications requiring advanced mathematical reasoning, calculations, and logical deduction.
- Research in Mathematical AI: Useful for researchers exploring methods to improve LLM performance on quantitative tasks.
- Educational Tools: Can be integrated into systems designed to assist with or generate solutions for mathematical challenges.
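When integrating a math-specialized model like this into an evaluation or educational pipeline, final answers are conventionally emitted in LaTeX `\boxed{...}` form. A small helper to extract them might look like the following; this is a hypothetical utility for downstream use, not part of the model's release:

```python
def extract_boxed(solution):
    """Return the contents of the last \\boxed{...} in a solution string,
    matching nested braces, or None if no box is found."""
    marker = r"\boxed{"
    start = solution.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1
    out = []
    while i < len(solution):
        c = solution[i]
        if c == "{":
            depth += 1
        elif c == "}":
            depth -= 1
            if depth == 0:
                break
        out.append(c)
        i += 1
    return "".join(out) if depth == 0 else None

ans = extract_boxed(r"The sum of the first 50 positive integers is \boxed{1275}.")
```

Brace matching (rather than a simple regex) keeps nested expressions such as `\boxed{\frac{1}{2}}` intact.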