WSX/Qwen2.5-1.5B-Open-R1-GRPO-FC
WSX/Qwen2.5-1.5B-Open-R1-GRPO-FC is a 1.5-billion-parameter language model fine-tuned by WSX. It is based on the Qwen2.5 architecture and was trained with the GRPO method on the AI-MO/NuminaMath-TIR dataset. The model is optimized for mathematical reasoning, drawing on techniques from DeepSeekMath; its primary strength is mathematical problem solving at a compact parameter count.
Model Overview
WSX/Qwen2.5-1.5B-Open-R1-GRPO-FC is a 1.5-billion-parameter language model developed by WSX. It is a fine-tuned variant of the Qwen2.5 architecture, specifically optimized for mathematical reasoning. Training used GRPO (Group Relative Policy Optimization), a reinforcement-learning method introduced in the DeepSeekMath paper to push the limits of mathematical reasoning in open language models.
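The core idea of GRPO, as described in the DeepSeekMath paper, can be sketched in a few lines: instead of learning a separate value network, each sampled completion's reward is normalized against the mean and standard deviation of its sampling group to form an advantage. This is an illustrative sketch, not this model's actual training code:

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize each reward against its group's mean/std (GRPO-style baseline)."""
    mu = mean(rewards)
    sigma = pstdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mu) / sigma for r in rewards]

# Completions that beat the group average get positive advantages,
# those below it get negative ones.
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))  # [1.0, -1.0, 1.0, -1.0]
```

These advantages then weight a clipped policy-gradient update, much like PPO but with the group statistics standing in for a learned baseline.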
Key Capabilities
- Enhanced Mathematical Reasoning: Fine-tuned on the AI-MO/NuminaMath-TIR dataset, this model is designed to excel at complex mathematical problems and logical deduction.
- GRPO Training: Utilizes the GRPO method, known for improving mathematical problem-solving performance.
- Compact Size: At 1.5 billion parameters, it offers mathematical reasoning capabilities in a relatively small footprint.
Good For
- Applications requiring strong mathematical problem-solving.
- Research and development in improving LLM performance on quantitative tasks.
- Scenarios where a smaller, specialized model for mathematical reasoning is preferred over larger, general-purpose models.
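A minimal usage sketch with Hugging Face transformers. The Hub ID is taken from this card; the chat-template flow follows the generic Qwen2.5 pattern and has not been verified against this specific checkpoint:

```python
MODEL_ID = "WSX/Qwen2.5-1.5B-Open-R1-GRPO-FC"  # Hub ID from this card

def build_messages(problem: str) -> list[dict]:
    """Wrap a math problem as a single-turn chat message list."""
    return [{"role": "user", "content": problem}]

def solve(problem: str, max_new_tokens: int = 512) -> str:
    # Lazy import so the helper above stays usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    prompt = tokenizer.apply_chat_template(
        build_messages(problem), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

For math-heavy prompts, greedy or low-temperature decoding is usually a reasonable starting point with small reasoning models.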