Overview
This model, chinna6/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-scaly_padded_macaw, is a 0.5 billion parameter instruction-tuned language model. It is a fine-tuned variant of the Gensyn/Qwen2.5-0.5B-Instruct base model developed by Gensyn. The fine-tuning process used the TRL library and specifically the GRPO (Group Relative Policy Optimization) method.
Key Capabilities
- Enhanced Mathematical Reasoning: GRPO, the reinforcement learning method introduced in the DeepSeekMath paper, was used during fine-tuning with the aim of improving the model's ability to handle mathematical reasoning tasks.
- Instruction Following: As an instruction-tuned model, it is designed to follow user prompts and instructions effectively.
- Extended Context Window: It supports a substantial context length of 32768 tokens, allowing for processing longer inputs and generating more coherent, extended responses.
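The capabilities above can be exercised through the standard Hugging Face transformers API. The following is an illustrative sketch (not verified against this specific checkpoint); it assumes the model ships a chat template inherited from the Qwen2.5-Instruct base, and the example question is arbitrary.

```python
# Illustrative sketch: loading the model and generating a reply to a
# math-style prompt via the tokenizer's chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "chinna6/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-scaly_padded_macaw"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "What is 17 * 24?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```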
Training Details
The model was trained using TRL version 0.15.2, with Transformers 4.48.2, PyTorch 2.5.1, Datasets 3.6.0, and Tokenizers 0.21.1. The GRPO method, detailed in the DeepSeekMath research paper, was a core component of its training procedure.
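To make the GRPO component concrete, here is a minimal sketch of the group-relative advantage computation at the heart of the method, following the formulation in the DeepSeekMath paper: several completions are sampled per prompt, each is scored by a reward function, and each completion's advantage is its reward standardized against the group's mean and standard deviation. The reward values below are made up for illustration.

```python
# Sketch of GRPO's group-relative advantage computation. In practice TRL's
# GRPOTrainer handles this internally; this only illustrates the idea.
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one sampled group of completions."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps guards against division by zero when all rewards are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four completions for one prompt, scored 1.0 if correct, else 0.0.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Because advantages are centered within each group, they sum to (approximately) zero, so completions are rewarded only relative to their sampled peers rather than against an absolute baseline.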
Good For
- Applications requiring a compact model with improved mathematical reasoning.
- Instruction-following tasks where a longer context window is beneficial.
- Exploration of models fine-tuned with advanced reinforcement learning techniques like GRPO.