tuteeee/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-carnivorous_pensive_salmon
tuteeee/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-carnivorous_pensive_salmon is a 0.5-billion-parameter instruction-tuned language model, fine-tuned from Gensyn/Qwen2.5-0.5B-Instruct. It was trained with GRPO (Group Relative Policy Optimization), a method introduced in the DeepSeekMath paper to strengthen mathematical reasoning. The model supports a context length of 32768 tokens and is best suited to tasks that require improved reasoning, particularly in mathematical contexts.
Overview
This model, tuteeee/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-carnivorous_pensive_salmon, is a 0.5-billion-parameter instruction-tuned variant of the Gensyn/Qwen2.5-0.5B-Instruct base model. It was fine-tuned with GRPO (Group Relative Policy Optimization), a reinforcement-learning technique introduced in the DeepSeekMath paper for pushing the limits of mathematical reasoning in open language models. Training was carried out with the TRL framework.
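The core idea behind GRPO is that, instead of training a separate value network as a baseline, rewards for a group of sampled completions to the same prompt are normalized against the group's own mean and standard deviation. The sketch below illustrates that group-relative advantage computation in plain Python; it is a simplified illustration of the idea from the DeepSeekMath paper, not this model's actual training code, and the function name and example rewards are hypothetical.

```python
from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each completion's reward against its group's mean and
    standard deviation, as in GRPO's group-relative baseline (sketch)."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical example: four sampled completions for one prompt,
# each scored by a reward function (e.g. answer correctness).
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Completions scoring above the group mean get positive advantages and are reinforced; those below the mean are penalized, with no learned critic required.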
Key Capabilities
- Enhanced Mathematical Reasoning: Leverages the GRPO training method to improve performance on tasks requiring mathematical reasoning.
- Instruction Following: Retains the instruction-following behavior of its instruction-tuned base model.
- Compact Size: At 0.5 billion parameters, it offers a smaller footprint while aiming for specialized reasoning improvements.
- Extended Context Window: Supports a context length of 32768 tokens, allowing it to process longer inputs.
Good for
- Applications requiring a compact model with improved mathematical reasoning abilities.
- Tasks where instruction following is crucial and a smaller model size is advantageous.
- Research and experimentation with GRPO-trained models for specialized reasoning tasks.