XlHoWcLGeuQ/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-burrowing_voracious_bear

Text Generation · Concurrency Cost: 1 · Model Size: 0.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 22, 2025 · Architecture: Transformer

The XlHoWcLGeuQ/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-burrowing_voracious_bear is a 0.5 billion parameter instruction-tuned causal language model, fine-tuned from Gensyn/Qwen2.5-0.5B-Instruct. It was trained using the TRL framework and the GRPO method, which is designed to enhance mathematical reasoning. This model is optimized for tasks requiring improved mathematical reasoning capabilities, leveraging its 32768 token context length.


Model Overview

This model, XlHoWcLGeuQ/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-burrowing_voracious_bear, is a 0.5 billion parameter instruction-tuned language model. It is a fine-tuned variant of the Gensyn/Qwen2.5-0.5B-Instruct base model, trained to strengthen mathematical reasoning.

Key Training Details

The model was trained using the TRL (Transformer Reinforcement Learning) library. A notable aspect of its training procedure is the use of GRPO (Group Relative Policy Optimization), a reinforcement learning method introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" specifically to improve mathematical reasoning in language models.
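To give a feel for what GRPO does, the sketch below illustrates its core idea: several completions are sampled per prompt, and each completion's advantage is its reward normalized against the group's mean and standard deviation, removing the need for a learned value function. This is an illustrative simplification, not the TRL implementation.

```python
# Illustrative sketch of GRPO's group-relative advantage computation.
# In GRPO, a group of completions is sampled for each prompt, and each
# completion's advantage is its reward standardized within the group.
from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize a group's rewards to zero mean and unit variance."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four sampled answers to one math prompt, scored 1.0 if correct:
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
# Correct answers receive positive advantage, incorrect ones negative,
# so the policy is pushed toward the better completions in each group.
```

In TRL this normalization happens inside the `GRPOTrainer`; the sketch only shows why per-group reward standardization can substitute for a critic.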

Capabilities and Use Cases

Given its training with the GRPO method, this model is particularly suited for applications that benefit from improved mathematical reasoning. Its instruction-tuned nature makes it capable of following user prompts effectively. Developers can leverage this model for tasks where a smaller, specialized model with enhanced mathematical understanding is advantageous, especially within its 32768 token context window.
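As a starting point, the sketch below shows how a prompt for this model might be assembled. Qwen2.5-Instruct models use the ChatML format; in practice the tokenizer's `apply_chat_template()` builds this string for you, so the helper here is only a hypothetical illustration of the layout.

```python
# Minimal sketch of prompting a Qwen2.5-Instruct model (assumption: the
# standard Qwen2.5 ChatML layout; normally produced by the tokenizer's
# apply_chat_template() rather than built by hand).

def build_chatml_prompt(user_message: str,
                        system_message: str = "You are a helpful assistant.") -> str:
    """Format a single-turn conversation in Qwen2.5's ChatML layout."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("What is 17 * 24?")

# With the transformers library installed, generation would look roughly
# like the following (left as comments, since it downloads the checkpoint):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# model_id = "XlHoWcLGeuQ/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-burrowing_voracious_bear"
# tokenizer = AutoTokenizer.from_pretrained(model_id)
# model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")
# inputs = tokenizer(prompt, return_tensors="pt")
# print(tokenizer.decode(model.generate(**inputs, max_new_tokens=128)[0]))
```

Because the model is instruction-tuned, keeping the system/user/assistant structure intact generally matters more for output quality than the exact wording of the system message.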