Name: pngwn/qwen2.5-0.5b-gsm8k-sft API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: pngwn

Model Overview

pngwn/qwen2.5-0.5b-gsm8k-sft is a 0.5-billion-parameter model built on the Qwen2.5 architecture. What distinguishes it from the base model is supervised fine-tuning (SFT) for grade-school mathematical reasoning on the openai/gsm8k dataset.

Key Capabilities & Performance

This model is specialized for arithmetic word problems of the kind found in the GSM8K benchmark. It achieves an exact-match accuracy of 0.3472 (458/1319) on the GSM8K test set, compared with 0.0008 for the base Qwen/Qwen2.5-0.5B model. This specialization makes it considerably more reliable than the base model on grade-school numerical problem-solving, although roughly two in three test problems are still answered incorrectly.
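The exact-match figure above can be reproduced from the reported counts. The following is a minimal sketch of exact-match scoring, not the evaluation script behind the reported numbers: the function name `exact_match_accuracy` and the normalization step (stripping whitespace and thousands separators) are assumptions made for illustration.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after light normalization.

    Hypothetical helper: the normalization below is an assumption,
    not the documented behavior of the original evaluation.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0

    def normalize(answer):
        # Strip whitespace and thousands separators so "1,000" matches "1000".
        return answer.strip().replace(",", "")

    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)


# Reported counts from the text: 458 correct out of 1319 test problems.
reported_accuracy = round(458 / 1319, 4)
print(reported_accuracy)  # 0.3472
```

Exact match is strict: a prediction that reaches the right value in a different surface form (for example "18.0" versus "18") counts as wrong unless the normalization handles it, so the choice of normalization affects the reported score.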

Training Details

The model was trained for 3 epochs on 7473 samples from the openai/gsm8k dataset, with a maximum sequence length of 1024 tokens. For evaluation, decoding was greedy, and final answers were extracted from the generated text with a regex pattern.
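The source does not give the regex used for answer extraction. As an illustration only, the sketch below assumes the convention of GSM8K reference solutions, which end with a line of the form `#### <number>`. The pattern `ANSWER_PATTERN` and the helper `extract_answer` are hypothetical names, and the actual evaluation pattern may differ.

```python
import re

# Assumed pattern: GSM8K reference solutions end with "#### <number>".
# The regex used for the reported evaluation is not stated in the source.
ANSWER_PATTERN = re.compile(r"####\s*(-?[\d,]*\.?\d+)")


def extract_answer(generated_text):
    """Return the final numeric answer as a string, or None if none is found."""
    matches = ANSWER_PATTERN.findall(generated_text)
    if not matches:
        return None
    # Use the last match and drop thousands separators ("1,234" -> "1234").
    return matches[-1].replace(",", "")


sample = "She has 16 - 3 - 4 = 9 eggs left. 9 * 2 = 18.\n#### 18"
print(extract_answer(sample))  # 18
```

Taking the last match rather than the first is a design choice: if the model repeats the marker mid-solution, the final occurrence is the one most likely to hold the answer it settled on.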

Use Cases

This model is particularly well-suited for:

Educational applications: Assisting students with math homework or generating practice problems.
Automated grading: Evaluating solutions to mathematical problems.
Specialized reasoning tasks: Any application where accurate, step-by-step mathematical problem-solving is critical, especially at a foundational level.
