Name: AmirMohseni/qwen-2.5-math-1.5b-dsr-sub-v2 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: AmirMohseni

Model Overview

AmirMohseni/qwen-2.5-math-1.5b-dsr-sub-v2 is a 1.5-billion-parameter language model fine-tuned from the base model Qwen/Qwen2.5-Math-1.5B. It was trained with the TRL framework.

Key Capabilities

Enhanced Mathematical Reasoning: The model targets mathematical tasks and was fine-tuned with GRPO (Group Relative Policy Optimization).
GRPO Training: GRPO was introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). It was applied here to improve the model's mathematical problem-solving abilities.
Qwen2.5 Architecture: Built upon the Qwen2.5 family, it inherits a robust base for language understanding and generation.
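
The central idea of GRPO, as described in the DeepSeekMath paper, is to sample a group of completions for the same prompt and score each one relative to the rest of its group, rather than against a learned value model. The snippet below is a minimal sketch of that normalization step only. It is not this model's training code: the function name, the example rewards, the use of the population standard deviation, and the small eps term are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group: (r - mean) / (std + eps).

    `rewards` holds one scalar reward per completion sampled for a single
    prompt. `eps` (an assumption) avoids division by zero when every
    completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of four completions: 1.0 = correct answer, 0.0 = wrong.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above the group average get a positive advantage and those below get a negative one. When every completion in a group scores the same, all advantages are zero and that group contributes no learning signal.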

When to Use This Model

This model is particularly well-suited for use cases that require:

Mathematical Problem Solving: Suited to applications that need mathematical reasoning from a compact 1.5B-parameter model.
Research in Mathematical LLMs: Useful for researchers exploring advanced training techniques like GRPO for specialized domains.
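
For the problem-solving use case, a local run might look like the sketch below. It assumes the checkpoint can be loaded by its identifier through the Transformers text-generation pipeline, and it borrows the "reason step by step ... \boxed{}" instruction commonly used with Qwen2.5-Math models. Whether this fine-tune expects that exact prompt is an assumption, and build_prompt, extract_boxed, and solve are hypothetical helper names.

```python
MODEL_ID = "AmirMohseni/qwen-2.5-math-1.5b-dsr-sub-v2"


def build_prompt(problem: str) -> str:
    # Prompt wording is an assumption borrowed from Qwen2.5-Math usage.
    return (
        f"{problem}\n"
        "Please reason step by step, and put your final answer within \\boxed{}."
    )


def extract_boxed(text: str):
    # Return the content of the last \boxed{...} in text, or None if absent.
    # Braces are counted so nested LaTeX such as \frac{1}{2} is kept whole.
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    content = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(content)
        content.append(ch)
    return None  # unbalanced braces, e.g. truncated output


def solve(problem: str, max_new_tokens: int = 512):
    # Imported lazily so the helpers above work without Transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    result = generator(
        build_prompt(problem),
        max_new_tokens=max_new_tokens,
        do_sample=False,         # greedy decoding for repeatable answers
        return_full_text=False,  # exclude the prompt, which itself contains \boxed{}
    )
    return extract_boxed(result[0]["generated_text"])
```

Calling solve("What is 12 * 7?") downloads the weights on first use. The helper returns None when the model does not produce a boxed answer within the token budget.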

Technical Details

The model was trained with the following framework versions: TRL 0.22.0.dev0, Transformers 4.55.4, PyTorch 2.7.1, Datasets 4.0.0, and Tokenizers 0.21.4.
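
When trying to reproduce the training setup, it can help to compare a local environment against the versions listed above. The following is a small sketch using only the standard library. The package names (for example "torch" for PyTorch) are the usual PyPI distribution names and are assumed here, and version_mismatches is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed for this model; keys are assumed PyPI distribution names.
EXPECTED = {
    "trl": "0.22.0.dev0",
    "transformers": "4.55.4",
    "torch": "2.7.1",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}


def version_mismatches(expected):
    """Return {package: (wanted, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            # Drop local build tags such as "+cu126" before comparing.
            installed = version(package).split("+")[0]
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in version_mismatches(EXPECTED).items():
    print(f"{package}: expected {wanted}, found {installed or 'not installed'}")
```

Matching these versions exactly is not required for inference. The comparison is mainly useful when a training run needs to mirror the original environment.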
