Name: rubricreward/R3-Qwen3-4B-14k API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: rubricreward

R3-Qwen3-4B-14k: A Rubric-Agnostic Reward Model

R3-Qwen3-4B-14k is a 4-billion-parameter reward model developed by rubricreward, fine-tuned from the Qwen3-4B base model. It belongs to the R3 (Robust Rubric-Agnostic Reward Models) family, which is designed to score responses against a supplied rubric and explain each score.

Key Capabilities

Rubric-Agnostic Evaluation: Trained on the R3 dataset, compiled from 45 diverse sources, so it can evaluate responses across various tasks without being tied to a single rubric format.
Comprehensive Assessment: Each training example includes an instruction, task description, input, response(s), evaluation rubrics, a score, and corresponding reasoning, allowing the model to generate detailed assessments.
Task Versatility: The training dataset covers a broad spectrum of tasks, including classification, preference optimization, and question answering, which helps the model adapt to different evaluation scenarios.
English Language Support: Primarily supports evaluation of English-language text.
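
The fields listed under Comprehensive Assessment can be pictured as a single record. The sketch below is only an illustration: the class name, field names, prompt layout, and sample values are hypothetical and are not taken from the actual R3 dataset schema or the model's prompt template.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RubricExample:
    """One evaluation record; all field names here are hypothetical."""
    instruction: str
    task_description: str
    input_text: str
    responses: List[str]
    rubric: str
    score: str       # target label, e.g. a rating or the preferred response
    reasoning: str   # target explanation that accompanies the score


def render_prompt(ex: RubricExample) -> str:
    """Join the evaluator-facing fields into one plain-text prompt.

    The score and reasoning are what the model is expected to produce,
    so they are left out of the prompt.
    """
    parts = [
        ex.instruction,
        "Task: " + ex.task_description,
        "Input: " + ex.input_text,
    ]
    for i, resp in enumerate(ex.responses, start=1):
        parts.append(f"Response {i}: {resp}")
    parts.append("Rubric: " + ex.rubric)
    return "\n".join(parts)


# Made-up sample values, for illustration only.
example = RubricExample(
    instruction="Evaluate the responses using the rubric.",
    task_description="Question answering",
    input_text="What is the capital of France?",
    responses=["Paris", "Lyon"],
    rubric="Prefer the factually correct response.",
    score="Response 1",
    reasoning="Paris is the capital of France; Lyon is not.",
)
prompt = render_prompt(example)
```

Keeping the rubric as an ordinary text field is what makes such a record rubric-agnostic: a different rubric format changes the field's content, not the record's shape.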

Good For

Automated Feedback Systems: Suited to systems that need automated evaluation of generated text against specific criteria and rubrics.
Preference Optimization: Can rank responses or choose between candidates, with detailed reasoning for each judgment.
Quality Assurance: Suitable for checking response quality and adherence to the given instructions and evaluation guidelines.
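
For the preference use case, one way to consume scored assessments is sketched below. It rests on stated assumptions: the evaluator is assumed to return JSON with "score" and "explanation" keys, which this page does not confirm, and `judge` is a hypothetical callable standing in for a real model call. The `fake_judge` stub exists only so the example runs without a network connection.

```python
import json
from typing import Callable, List, Tuple


def parse_assessment(raw: str) -> Tuple[float, str]:
    """Parse one assessment. The JSON shape is an assumption, not a documented format."""
    data = json.loads(raw)
    return float(data["score"]), str(data["explanation"])


def pick_preferred(
    responses: List[str], judge: Callable[[str], str]
) -> Tuple[str, str]:
    """Score every response with `judge` and return the best one with its reasoning."""
    scored = []
    for resp in responses:
        score, why = parse_assessment(judge(resp))
        scored.append((score, resp, why))
    # max() keeps the first item on ties, so earlier responses win a draw.
    best = max(scored, key=lambda item: item[0])
    return best[1], best[2]


def fake_judge(response: str) -> str:
    """Stand-in for a real model call: longer answers score higher."""
    return json.dumps(
        {"score": len(response), "explanation": f"length {len(response)}"}
    )


winner, reason = pick_preferred(["ok", "a fuller answer"], fake_judge)
```

Returning the explanation alongside the winning response keeps the reasoning available for logging or human review.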
