Jason-hu/Qwen2.5-3B-GSM8K-GRPO-H200

Text Generation · Concurrency Cost: 1 · Model Size: 3.1B · Quant: BF16 · Ctx Length: 32k · Published: Mar 25, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

Qwen2.5-3B-GSM8K-GRPO-H200 is a 3.1-billion-parameter language model by Jason-hu, fine-tuned for mathematical reasoning. Built on Qwen2.5-3B-Instruct, it was adapted with LoRA-based supervised fine-tuning (SFT) on the GSM8K dataset. The model is optimized for mathematical problem solving, with improved performance on quantitative reasoning, and supports a context length of 32,768 tokens.


Overview

Jason-hu/Qwen2.5-3B-GSM8K-GRPO-H200 is a specialized language model with 3.1 billion parameters, derived from the Qwen2.5-3B-Instruct base. It was fine-tuned with LoRA (Low-Rank Adaptation) supervised fine-tuning (SFT) in the verl framework.
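
As a quick illustration, the snippet below shows one way to run the model on a GSM8K-style word problem. It is a minimal sketch, assuming the repository loads through the standard transformers API and ships the Qwen2.5 chat template (typical for Qwen2.5 fine-tunes, but not confirmed on this page).

```python
# Minimal inference sketch; assumes the checkpoint loads with the standard
# transformers API and includes the Qwen2.5 chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Jason-hu/Qwen2.5-3B-GSM8K-GRPO-H200"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 quantization listed above
    device_map="auto",
)

# A GSM8K-style grade-school word problem as the user turn.
messages = [
    {
        "role": "user",
        "content": "Natalia sold clips to 48 of her friends in April, and then "
                   "she sold half as many clips in May. How many clips did "
                   "Natalia sell altogether in April and May?",
    }
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```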

Key Capabilities

  • Mathematical Reasoning: The model is trained on the GSM8K dataset of grade-school math word problems, improving its ability to understand and solve quantitative tasks.
  • Instruction Following: Inherits strong instruction-following capabilities from its Qwen2.5-3B-Instruct foundation.
  • Efficient Fine-tuning: Uses LoRA for parameter-efficient adaptation, making it a resource-effective choice for specialized tasks (see the sketch after this list).
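
The exact adapter configuration is not published on this page, so the following is only a hypothetical sketch of a LoRA setup over the same base model using the peft library. The rank, alpha, dropout, and target modules are illustrative assumptions, not the author's actual values.

```python
# Hypothetical LoRA configuration sketch; the published card does not state
# the actual ranks or target modules, so these values are illustrative only.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
lora_config = LoraConfig(
    r=16,                      # assumed adapter rank
    lora_alpha=32,             # assumed scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed: attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()  # only the low-rank adapters are trained
```

Because only the low-rank adapter matrices receive gradients, this style of fine-tuning updates a small fraction of the 3.1B parameters, which is what makes the adaptation resource-effective.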

Good For

  • Mathematical Problem Solving: Ideal for applications requiring accurate solutions to arithmetic and word problems.
  • Educational Tools: Can be integrated into educational platforms for generating explanations or solving math homework.
  • Quantitative Analysis: Suitable for tasks where precise numerical reasoning is critical.