yikeee/Open-Reward-Agent-sft-rubric-only
yikeee/Open-Reward-Agent-sft-rubric-only is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B on the open_reward_agent_rubric_sft_mix dataset, indicating an optimization for tasks involving reward-agent rubrics. It supports a 32768-token context length, making it suitable for processing extensive inputs in its specialized domain.
Model Overview
yikeee/Open-Reward-Agent-sft-rubric-only is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B architecture. Its training data, the open_reward_agent_rubric_sft_mix dataset, suggests a specialization in tasks involving reward-agent rubrics.
Key Characteristics
- Base Model: Qwen/Qwen3-8B
- Parameter Count: 8 billion
- Context Length: 32768 tokens, enabling the processing of long inputs.
- Specialized Training: Fine-tuned on the open_reward_agent_rubric_sft_mix dataset, which targets reward-agent rubric tasks.
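Assuming the checkpoint is published on the Hugging Face Hub under the name above, it can be loaded with the standard transformers auto classes. This is a sketch, not documented usage from the card; imports are deferred inside the function so the snippet can be read and checked without pulling in transformers or downloading the 8B weights:

```python
def load_model(repo_id: str = "yikeee/Open-Reward-Agent-sft-rubric-only"):
    """Load the fine-tuned model and its tokenizer from the Hugging Face Hub.

    Deferred imports: calling this function requires transformers (and a
    backend such as PyTorch) to be installed, and will download the weights.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # torch_dtype="auto" picks the dtype stored in the checkpoint config.
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
    return model, tokenizer
```

Since the base model is Qwen3-8B, the tokenizer's chat template (via `tokenizer.apply_chat_template`) is presumably the intended way to format conversations, though the card does not confirm this.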
Training Details
The model was trained for 3 epochs with a learning rate of 4e-05 and a total batch size of 64 (a per-device train_batch_size of 1 with gradient_accumulation_steps of 16; the reported total implies additional data parallelism). The optimizer was ADAMW_TORCH_FUSED with a cosine learning rate schedule. This configuration suggests a focused effort to adapt the base Qwen3-8B model to rubric-based tasks.
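The batch-size arithmetic can be made explicit. A per-device batch of 1 with 16 accumulation steps yields 16 examples per update, so reaching the reported total of 64 implies data parallelism across 4 devices; the world size of 4 is an inference from the numbers, not stated on the card:

```python
def effective_batch_size(per_device_batch: int,
                         grad_accum_steps: int,
                         world_size: int) -> int:
    """Total examples contributing to one optimizer update."""
    return per_device_batch * grad_accum_steps * world_size

# Values from the training configuration; world_size=4 is inferred,
# since 1 * 16 alone gives 16, not the reported total of 64.
print(effective_batch_size(1, 16, 4))  # -> 64
```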
Intended Use Cases
Specific intended uses and limitations are not documented. However, fine-tuning on a reward-agent rubric dataset suggests the model is suited to understanding, generating, or evaluating content against predefined rubrics, for example in reward modeling or agent-behavior assessment.
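One plausible usage pattern for such a model is prompting it with an answer and a list of rubric criteria to grade against. The prompt format below is purely illustrative: the actual schema expected by the model depends on its SFT data, which is not documented here.

```python
def build_rubric_prompt(question: str, answer: str, rubric: list[str]) -> str:
    """Assemble a rubric-grading prompt.

    Illustrative format only; the format the model was actually trained on
    (in open_reward_agent_rubric_sft_mix) may differ.
    """
    criteria = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(rubric))
    return (
        "Evaluate the answer against each rubric criterion.\n\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        f"Rubric:\n{criteria}\n\n"
        "For each criterion, state whether it is satisfied and why."
    )

prompt = build_rubric_prompt(
    "What is 2 + 2?",
    "4",
    ["States the correct numeric result", "Shows reasoning"],
)
print(prompt)
```

The resulting string would then be wrapped in the tokenizer's chat template and passed to the model for generation.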