starfrich/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-amphibious_leaping_bison
Text generation · 0.5B parameters · BF16 · 32k context · Transformer architecture

starfrich/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-amphibious_leaping_bison is a 0.5 billion parameter instruction-tuned causal language model, fine-tuned from unsloth/Qwen2.5-0.5B-Instruct. It was trained with the GRPO method introduced in the DeepSeekMath paper, which targets mathematical reasoning, making it suited to tasks that demand stronger logical and numerical reasoning.


Model Overview

This model, starfrich/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-amphibious_leaping_bison, is a 0.5 billion parameter instruction-tuned language model. It is a fine-tuned variant of the unsloth/Qwen2.5-0.5B-Instruct base model, developed by starfrich.

Key Differentiator: GRPO Training

A significant aspect of this model is its training methodology. It was fine-tuned using GRPO (Group Relative Policy Optimization), a method detailed in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). This indicates a specialized focus on improving the model's mathematical reasoning and logical processing capabilities.
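The key idea behind GRPO is that it replaces the learned value baseline of PPO with a group-relative one: for each prompt, several completions are sampled, each is scored by a reward function, and a completion's advantage is its reward normalized against the group's mean and standard deviation. A minimal sketch of that advantage computation (illustrative only; the function name and `eps` parameter are assumptions, not from this model card):

```python
from statistics import mean, stdev

def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """GRPO-style advantages: A_i = (r_i - mean(r)) / (std(r) + eps).

    Each completion is scored relative to the other completions sampled
    for the same prompt, instead of against a learned value function.
    """
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four sampled answers to one math prompt, rewarded 1.0 if correct.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Correct completions in the group receive positive advantages and incorrect ones negative, so the policy gradient pushes probability mass toward the better answers within each group.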

Technical Details

  • Base Model: unsloth/Qwen2.5-0.5B-Instruct
  • Parameter Count: 0.5 Billion
  • Training Framework: TRL (Transformer Reinforcement Learning)
  • Context Length: 32,768 tokens
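Like other Qwen2.5 instruct variants, the base model uses a ChatML-style chat template, which `tokenizer.apply_chat_template` handles automatically. Formatting it by hand, as sketched below, shows what string the model actually receives (a sketch assuming the fine-tune inherits the base model's template; the helper function is hypothetical):

```python
def format_chatml(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Render a message list in the ChatML-style format used by Qwen2.5 templates."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    if add_generation_prompt:
        # Open an assistant turn so the model generates the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 17 * 24?"},
])
print(prompt)
```

In practice, prefer passing the message list to the pipeline or to `apply_chat_template` rather than hand-building the string, so any template updates in the repository are picked up automatically.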

Use Cases

Given its GRPO-based training, this model is particularly well-suited for:

  • Mathematical problem-solving: Tasks requiring logical deduction and numerical reasoning.
  • Instruction following: Benefiting from its instruction-tuned nature.
  • Applications where enhanced reasoning is critical: Especially in domains that can leverage the GRPO method's strengths.
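For the mathematical use case, GRPO-style training pipelines typically score sampled completions with a verifiable reward: extract the final answer from the generated text and compare it to a known ground truth. A simplified sketch of such a reward function (the function name and extraction heuristic are assumptions for illustration, not details from this model card):

```python
import re

def math_reward(completion: str, expected: str) -> float:
    """Return 1.0 if the last number in the completion equals the expected answer.

    A simplified verifiable reward of the kind used to score sampled
    completions during math-focused RL fine-tuning.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0
    return 1.0 if numbers[-1] == expected else 0.0

print(math_reward("17 * 24 = 408, so the answer is 408.", "408"))  # 1.0
print(math_reward("I think the answer is 400.", "408"))            # 0.0
```

Rewards like this are what feed the group-relative advantage computation: because correctness is checkable, no learned reward model is needed for this domain.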

Quick Start Example

Users can quickly integrate and test the model using the transformers library:

from transformers import pipeline

# Build a text-generation pipeline for the fine-tuned model
# (omit device or use device=-1 to run on CPU)
generator = pipeline(
    "text-generation",
    model="starfrich/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-amphibious_leaping_bison",
    device="cuda",
)

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"

# Pass a chat-style message list; return only the newly generated text
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])