Axelerate/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-flexible_bold_butterfly
Axelerate/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-flexible_bold_butterfly is a fine-tuned instruction-following language model based on Gensyn/Qwen2.5-0.5B-Instruct. It was trained with the GRPO method, which is designed to enhance mathematical reasoning, making it suitable for tasks that require logical and mathematical problem-solving on top of the base Qwen2.5 architecture.
Overview
This model, Axelerate/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-flexible_bold_butterfly, is an instruction-tuned variant of the Gensyn/Qwen2.5-0.5B-Instruct base model. It was fine-tuned with the TRL library using GRPO (Group Relative Policy Optimization), a reinforcement-learning method introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" and designed for tasks that demand robust mathematical and logical reasoning.
Key Capabilities
- Instruction Following: Inherits and refines the instruction-following abilities of the Qwen2.5-Instruct series.
- Enhanced Reasoning: Benefits from GRPO training, which is associated with improved mathematical reasoning in language models.
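As an instruction-tuned checkpoint, the model can be loaded with the standard `transformers` chat workflow. The sketch below is a minimal, hedged example; the system prompt and generation settings are illustrative choices, not part of the model card (heavy imports are deferred into the function so the prompt helper works on its own):

```python
MODEL_ID = "Axelerate/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-flexible_bold_butterfly"

def build_messages(question: str) -> list[dict]:
    # Chat-format input for a Qwen2.5-Instruct-style model.
    # The system prompt here is an illustrative choice, not prescribed by the card.
    return [
        {"role": "system", "content": "You are a helpful assistant that reasons step by step."},
        {"role": "user", "content": question},
    ]

def generate(question: str, max_new_tokens: int = 256) -> str:
    # Imported here so build_messages() is usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    prompt = tokenizer.apply_chat_template(
        build_messages(question), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )

if __name__ == "__main__":
    print(generate("What is 17 * 23?"))
```

Downloading the 0.5B checkpoint requires network access; on CPU-only machines, `device_map="auto"` will fall back to CPU.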
Training Details
The model was trained with specific versions of key frameworks:
- TRL: 0.15.2
- Transformers: 4.48.2
- PyTorch: 2.5.1
- Datasets: 3.6.0
- Tokenizers: 0.21.1
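The TRL version above ships a `GRPOTrainer`, so a training run of this shape can be sketched as follows. This is a toy illustration, not the actual training recipe: the dataset, the reward function, and all hyperparameters are invented stand-ins (a real setup would score mathematical correctness of completions, and heavy imports are deferred into `main()`):

```python
# Toy reward: longer, non-empty completions score higher. This stands in for
# a real math-correctness reward and is purely illustrative.
def length_reward(completions, **kwargs):
    return [min(len(c) / 100.0, 1.0) for c in completions]

def main():
    # Imported here so the reward function can be inspected without TRL installed.
    from datasets import Dataset
    from trl import GRPOConfig, GRPOTrainer

    # Hypothetical two-prompt dataset; GRPO samples multiple completions
    # per prompt and ranks them within the group using the reward.
    dataset = Dataset.from_dict(
        {"prompt": ["Solve: 2 + 2 = ?", "Solve: 3 * 7 = ?"]}
    )
    args = GRPOConfig(
        output_dir="grpo-out",          # illustrative path
        per_device_train_batch_size=2,  # illustrative hyperparameters
        num_generations=2,
    )
    trainer = GRPOTrainer(
        model="Gensyn/Qwen2.5-0.5B-Instruct",  # the base model named above
        reward_funcs=length_reward,
        args=args,
        train_dataset=dataset,
    )
    trainer.train()

if __name__ == "__main__":
    main()
```

The key design point of GRPO is that rewards are compared across a group of completions for the same prompt, so no separate value model is needed.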
Good For
- Applications requiring a compact instruction-tuned model with a focus on logical or mathematical problem-solving.
- Scenarios where the base Qwen2.5-0.5B-Instruct model's reasoning capabilities need a boost through specialized fine-tuning.