Name: u00y/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-mammalian_tenacious_narwhal API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: u00y

Model Overview

This model, u00y/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-mammalian_tenacious_narwhal, is a specialized instruction-tuned language model with 0.5 billion parameters. It is a fine-tuned variant of the Gensyn/Qwen2.5-0.5B-Instruct base model.

Key Training Details

Fine-tuning Framework: The model was trained using the TRL library, a popular framework for Transformer Reinforcement Learning.
Optimization Method: A significant differentiator for this model is its training with GRPO (Gradient-based Reward Policy Optimization). This method, introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300), suggests an emphasis on improving mathematical reasoning abilities.

Intended Use

Given its fine-tuning with the GRPO method, this model is likely optimized for:

Mathematical Reasoning: Tasks that involve complex calculations, logical deductions, and problem-solving in mathematical contexts.
Instruction Following: As an instruction-tuned model, it is designed to respond effectively to user prompts and instructions.

Framework Versions

TRL: 0.15.2
Transformers: 4.51.1
Pytorch: 2.5.1
Datasets: 3.5.0
Tokenizers: 0.21.1

Overview

Model Overview

Key Training Details

Intended Use

Framework Versions

Full Model Card (README)