Name: lhkhiem28/Qwen2.5-3B-ha_grpo API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: lhkhiem28

Overview

lhkhiem28/Qwen2.5-3B-ha_grpo is a 3.1 billion parameter language model, fine-tuned from the base Qwen/Qwen2.5-3B-Instruct model. Its primary distinction lies in its specialized training using the HA_GRPO method on the lhkhiem28/HA-GRPO-datasets. This training approach, introduced in the DeepSeekMath paper, focuses on significantly improving the model's mathematical reasoning abilities.

Key Capabilities

Enhanced Mathematical Reasoning: Specifically optimized for complex mathematical problem-solving and logical deduction, leveraging the HA_GRPO training methodology.
Instruction Following: Retains the instruction-following capabilities of its base Qwen2.5-3B-Instruct model.
Context Length: Supports a substantial context window of 32768 tokens, beneficial for multi-step reasoning tasks.

Good For

Applications requiring strong mathematical problem-solving.
Tasks involving logical reasoning and complex calculations.
Developers looking for a compact model with specialized mathematical prowess, building upon the Qwen2.5 architecture.

Overview

Overview

Key Capabilities

Good For

Full Model Card (README)