coinex/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-aquatic_armored_okapi

Text Generation · Concurrency Cost: 1 · Model Size: 0.5B · Quantization: BF16 · Context Length: 32k · Published: Apr 5, 2025 · Architecture: Transformer · Status: Warm

The coinex/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-aquatic_armored_okapi model is a fine-tuned version of Gensyn/Qwen2.5-0.5B-Instruct, published by coinex. This 0.5-billion-parameter instruction-tuned model was trained with the TRL framework and incorporates GRPO, the reinforcement-learning method introduced in the DeepSeekMath paper, which targets stronger mathematical reasoning in open language models.


Model Overview

The coinex/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-aquatic_armored_okapi is an instruction-tuned language model derived from the Gensyn/Qwen2.5-0.5B-Instruct base model and further fine-tuned with the TRL (Transformer Reinforcement Learning) framework.
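A minimal loading and generation sketch with the Hugging Face transformers library (the standard route for Qwen2.5-based checkpoints); the prompt and generation settings below are illustrative, not values published with this checkpoint:

```python
# Minimal loading sketch; the prompt and generation settings are
# illustrative, not values published with this checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "coinex/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-aquatic_armored_okapi"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16  # BF16, per the metadata above
)

messages = [{"role": "user", "content": "What is 17 * 24?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=128)
# Strip the prompt tokens and decode only the model's reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```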

Key Training Details

  • Base Model: Gensyn/Qwen2.5-0.5B-Instruct
  • Fine-tuning Framework: TRL (version 0.15.2)
  • Training Method: Incorporates GRPO (Group Relative Policy Optimization), introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300), which points to a focus on improving mathematical reasoning abilities; see the training sketch after this list.
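The exact training recipe is not published, but TRL 0.15.x ships a GRPOTrainer. The sketch below shows what GRPO fine-tuning of the base model could look like in that style; the dataset choice and the toy length-based reward function are assumptions for illustration, not the recipe behind this checkpoint:

```python
# Hypothetical GRPO fine-tuning sketch with TRL's GRPOTrainer.
# The dataset and reward function are illustrative stand-ins, not the
# actual recipe used for this checkpoint.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Any dataset with a "prompt" column works; GSM8K is a common pick for
# math reasoning (an assumption, not confirmed by the model card).
dataset = load_dataset("openai/gsm8k", "main", split="train")
dataset = dataset.rename_column("question", "prompt")

def toy_reward(completions, **kwargs):
    # Toy reward favoring completions near 200 characters. A real math
    # setup would score answer correctness instead.
    return [-abs(len(c) - 200) / 200 for c in completions]

training_args = GRPOConfig(output_dir="qwen2.5-0.5b-grpo", num_generations=4)
trainer = GRPOTrainer(
    model="Gensyn/Qwen2.5-0.5B-Instruct",
    reward_funcs=toy_reward,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```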

Intended Use Cases

This model is particularly suitable for applications requiring:

  • Instruction following: As an instruction-tuned model, it is designed to respond effectively to user prompts.
  • Mathematical reasoning tasks: The use of GRPO in training indicates an emphasis on mathematical problems and logical reasoning; a quick smoke test follows this list.
  • Lightweight deployments: With 0.5 billion parameters, it offers a balance between performance and computational efficiency, making it suitable for environments with limited resources.
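As a quick check of the instruction-following and math behavior, a hedged smoke test using the transformers text-generation pipeline (the prompt is arbitrary, and the chat-format return shape assumes a recent transformers release):

```python
# Smoke test via the transformers pipeline; the prompt is arbitrary.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="coinex/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-aquatic_armored_okapi",
)
messages = [
    {"role": "user", "content": "A train covers 120 km in 1.5 hours. What is its average speed?"}
]
result = generator(messages, max_new_tokens=256)
# The pipeline returns the full conversation; the last turn is the reply.
print(result[0]["generated_text"][-1]["content"])
```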