divelab/DAPO_E2H-math-cosine
The divelab/DAPO_E2H-math-cosine model is a 1.5-billion-parameter instruction-tuned causal language model fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. Developed by divelab, it specializes in mathematical reasoning: it was trained on the MATH dataset with the E2H (Easy to Hard) curriculum method and uses GRPO-based reinforcement learning to improve performance on complex mathematical problems.
Overview
The divelab/DAPO_E2H-math-cosine model is a specialized 1.5-billion-parameter language model derived from Qwen/Qwen2.5-1.5B-Instruct. It was fine-tuned on the MATH dataset using the E2H (Easy to Hard Reasoning) training framework, which is built on top of Hugging Face's TRL library, and is designed for strong performance on mathematical reasoning tasks.
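Since the model is a standard fine-tune of Qwen2.5-1.5B-Instruct, it can be loaded with the usual Hugging Face `transformers` auto classes and queried through its chat template. A minimal sketch (the helper function, prompt, and generation settings are illustrative, not part of the model card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "divelab/DAPO_E2H-math-cosine"

def solve(problem: str, max_new_tokens: int = 512) -> str:
    """Load the model and generate a step-by-step solution for a math problem."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # Format the question with the Qwen2.5 chat template inherited from the base model.
    messages = [{"role": "user", "content": problem}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

# Example (downloads the model weights on first call):
# print(solve("If 3x + 7 = 22, what is x?"))
```

Because the instruction-following behavior comes from the Qwen2.5-1.5B-Instruct base, applying the chat template before generation is important; raw-completion prompting may degrade answer quality.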
Key Capabilities
- Enhanced Mathematical Reasoning: Optimized for solving complex mathematical problems through fine-tuning on the MATH dataset.
- GRPO Integration: Incorporates the GRPO method, as introduced in the DeepSeekMath paper, to push the limits of mathematical reasoning.
- Instruction Following: Retains instruction-following capabilities from its base Qwen2.5-1.5B-Instruct model.
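The GRPO method mentioned above replaces a learned value baseline with a group-relative one: for each prompt, several responses are sampled, and each response's reward is normalized against the mean and standard deviation of its group. A minimal, framework-free sketch of that normalization step (function name and the `eps` stabilizer are illustrative; real training uses TRL's GRPO implementation over token-level policy gradients):

```python
def grpo_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Compute group-relative advantages as in GRPO (DeepSeekMath):
    each sampled response's reward is standardized against its own group."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    # Responses better than the group average get positive advantage,
    # worse ones negative; eps guards against a zero-variance group.
    return [(r - mean) / (std + eps) for r in rewards]

# With binary correctness rewards over 4 sampled solutions:
advs = grpo_advantages([1.0, 0.0, 0.0, 1.0])
```

For a math task with verifiable answers, the rewards can simply be 1.0 for a correct final answer and 0.0 otherwise, which is what makes this style of RL attractive for the MATH dataset.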
Good For
- Applications requiring robust mathematical problem-solving.
- Research and development in mathematical reasoning with LLMs.
- Tasks benefiting from models trained with curriculum reinforcement learning (E2H).