coklatmanis886/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-foraging_docile_ibis is a 0.5 billion parameter instruction-tuned causal language model, fine-tuned from Gensyn/Qwen2.5-0.5B-Instruct. It was trained with the GRPO method, which is designed to enhance mathematical reasoning, and supports a context length of 32768 tokens. Its primary strength is this specialized training for mathematical problem-solving and logical reasoning tasks.
Model Overview
This model, coklatmanis886/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-foraging_docile_ibis, is a 0.5 billion parameter instruction-tuned language model. It is a fine-tuned variant of the Gensyn/Qwen2.5-0.5B-Instruct base model developed by Gensyn. The fine-tuning process used the TRL library and the GRPO (Group Relative Policy Optimization) training method.
Key Capabilities
- Enhanced Mathematical Reasoning: The model's training with GRPO, a method introduced in the DeepSeekMath paper, suggests a focus on improving mathematical problem-solving abilities.
- Instruction Following: As an instruction-tuned model, it is designed to respond effectively to user prompts and instructions.
- Large Context Window: It supports a substantial context length of 32768 tokens, allowing for processing longer inputs and maintaining conversational coherence over extended interactions.
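The snippet below is a minimal inference sketch using the `transformers` library. The generation settings and the example prompt are illustrative, and it assumes the tokenizer ships with the standard Qwen2.5 chat template.

```python
# Minimal inference sketch; dtype and generation settings are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "coklatmanis886/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-foraging_docile_ibis"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # use float32 on hardware without bf16 support
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant that reasons step by step."},
    {"role": "user", "content": "If 3x + 7 = 22, what is x?"},
]

# Build the prompt with the chat template and generate a response.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```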
Training Details
The model was trained using TRL 0.15.2, Transformers 4.51.3, PyTorch 2.5.1, Datasets 3.5.1, and Tokenizers 0.21.1. The GRPO method, central to its training, is detailed in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models."
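For orientation only, a GRPO fine-tuning run with TRL 0.15.x generally follows the pattern below. The dataset, reward function, and hyperparameters are placeholders, not the configuration actually used to produce this model.

```python
# Hypothetical GRPO fine-tuning sketch with TRL; dataset and reward are placeholders.
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy prompt-only dataset; GRPO expects a "prompt" column.
train_dataset = Dataset.from_dict(
    {"prompt": ["What is 12 * 7?", "Solve for x: 2x + 3 = 11."]}
)

def reward_len(completions, **kwargs):
    # Placeholder reward that favors concise completions; a real math-reasoning
    # reward would instead check answer correctness.
    return [-float(len(c)) for c in completions]

training_args = GRPOConfig(
    output_dir="qwen2.5-0.5b-grpo",   # illustrative output path
    per_device_train_batch_size=2,
    num_generations=2,                # completions sampled per prompt for the group baseline
    max_completion_length=128,
)

trainer = GRPOTrainer(
    model="Gensyn/Qwen2.5-0.5B-Instruct",
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()
```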
When to Use This Model
This model is particularly suitable for applications requiring strong mathematical reasoning and precise instruction following within a compact parameter size. Its specialized training makes it a candidate for tasks where numerical accuracy and logical deduction are paramount.