xfey/Qwen2.5-7B-Whitebox-GSM8k-Exp

TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Jul 3, 2025Architecture:Transformer Cold

The xfey/Qwen2.5-7B-Whitebox-GSM8k-Exp is a 7.6 billion parameter language model, fine-tuned from an unspecified base model specifically for mathematical reasoning tasks. It leverages the GRPO method, introduced in the DeepSeekMath paper, to enhance its performance on complex mathematical problem-solving. This model is primarily optimized for numerical and logical reasoning, making it suitable for applications requiring strong mathematical capabilities.

Loading preview...

Model Overview

The xfey/Qwen2.5-7B-Whitebox-GSM8k-Exp is a 7.6 billion parameter language model specifically fine-tuned for mathematical reasoning. It was trained on the openai/gsm8k dataset, which is a collection of grade school math word problems, using the TRL (Transformer Reinforcement Learning) framework.

Key Capabilities

  • Enhanced Mathematical Reasoning: The model incorporates the GRPO (Gradient-based Reward Policy Optimization) method, a technique designed to push the limits of mathematical reasoning in open language models, as detailed in the DeepSeekMath paper.
  • Problem-Solving Focus: Its training on the GSM8k dataset indicates a strong specialization in solving arithmetic and logical word problems.
  • TRL Framework: Utilizes the TRL library for its training procedure, suggesting potential for further reinforcement learning-based optimizations.

Good For

  • Mathematical Problem Solving: Ideal for tasks requiring the model to understand and solve mathematical word problems.
  • Educational Applications: Can be used in tools for teaching or assessing mathematical skills.
  • Research in Mathematical Reasoning: Provides a specialized base for further research into improving LLM performance on quantitative tasks.