Name: bmysec/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-tiny_flapping_ant API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: bmysec

Model Overview

The bmysec/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-tiny_flapping_ant is an instruction-tuned language model derived from the Gensyn/Qwen2.5-0.5B-Instruct base. This model has undergone specialized training using the TRL (Transformer Reinforcement Learning) framework, specifically employing the GRPO (Generalized Reinforcement Learning with Policy Optimization) method.

Key Training Details

Base Model: Gensyn/Qwen2.5-0.5B-Instruct
Fine-tuning Method: GRPO, as introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). This suggests a focus on improving reasoning abilities, particularly in mathematical contexts.
Frameworks Used: TRL (version 0.15.2), Transformers (version 4.48.2), Pytorch (version 2.5.1), Datasets (version 3.6.0), and Tokenizers (version 0.21.1).

Usage

This model is designed for instruction-following tasks. A quick start example using the transformers pipeline is provided for text generation, demonstrating how to query the model with a user prompt and retrieve generated text.

Overview

Model Overview

Key Training Details

Usage

Full Model Card (README)