Name: wmln/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-dappled_wiry_pheasant API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: wmln

Model Overview

This model, wmln/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-dappled_wiry_pheasant, is a specialized instruction-tuned language model with 0.5 billion parameters. It is built upon the Gensyn/Qwen2.5-0.5B-Instruct base model and has undergone further fine-tuning.

Key Training Details

The model was trained using the TRL (Transformer Reinforcement Learning) framework, specifically version 0.15.2. A notable aspect of its training procedure is the application of GRPO (Gradient Regularized Policy Optimization). This method, introduced in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300), suggests an optimization for tasks involving mathematical reasoning.

Intended Use Cases

Given its fine-tuning with GRPO, this model is likely optimized for:

Mathematical Reasoning Tasks: Potentially offering enhanced performance in solving mathematical problems or understanding mathematical concepts.
Instruction Following: General instruction-tuned capabilities inherited from its base model.

Developers can quickly integrate this model using the transformers library for text generation tasks, as demonstrated in the quick start guide.

Overview

Model Overview

Key Training Details

Intended Use Cases

Full Model Card (README)