Name: yahid/triage-agent-qwen3b API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: yahid

Overview

The yahid/triage-agent-qwen3b is a 3.1 billion parameter instruction-tuned language model, building upon the Qwen/Qwen2.5-3B-Instruct architecture. It has been specifically fine-tuned using the TRL library and incorporates the GRPO (Generative Reinforcement Learning with Policy Optimization) training method.

Key Capabilities

Enhanced Reasoning: The model's training with GRPO, a method introduced in the DeepSeekMath paper, suggests a focus on improving mathematical and general reasoning abilities.
Instruction Following: As an instruction-tuned model, it is designed to understand and execute user prompts effectively.
Large Context Window: Benefits from a 32768-token context length, allowing it to process and generate longer, more complex sequences of text.

Training Details

The model was trained using GRPO, a technique highlighted for its effectiveness in mathematical reasoning tasks. The training procedure leveraged specific versions of key frameworks:

TRL: 1.2.0
Transformers: 4.57.6
Pytorch: 2.10.0
Datasets: 4.8.4
Tokenizers: 0.22.2

Use Cases

This model is suitable for applications requiring strong reasoning capabilities and accurate instruction following, particularly in scenarios where the GRPO training method's benefits in mathematical or logical tasks could be advantageous.

Overview

Overview

Key Capabilities

Training Details

Use Cases

Full Model Card (README)