Name: teetone/RoboReward-8B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: teetone

RoboReward-8B: Vision-Language Reward Model for Robotics

RoboReward-8B is an 8-billion-parameter vision-language model developed by teetone, designed to provide general-purpose reward signals for robotic tasks. Built on the Qwen3-VL architecture, it is trained on the RoboReward dataset to score real-robot rollout videos.

Key Capabilities

Discrete Progress Prediction: Given a task instruction and a robot rollout video, the model predicts a discrete end-of-episode progress score from 1 to 5.
Robotic Task Evaluation: It assesses the final state of a rollout against the given task instruction, providing a quantitative measure of success or failure.
Vision-Language Integration: Combines visual information from videos with textual task instructions to understand and evaluate robotic performance.

Reward Rubric

The model uses a specific rubric to assign scores:

1 - No Success: No goal-relevant change.
2 - Minimal Progress: Small, insufficient change.
3 - Partial Completion: Good progress, but major violations or multiple minor ones.
4 - Near Completion: Correct intent, but a single minor requirement missed.
5 - Perfect Completion: All requirements satisfied.
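
The rubric above can be captured in code so that downstream tooling works with named levels rather than bare integers. The following is a minimal sketch, not part of the model's own tooling: the names `ProgressScore` and `parse_progress_score` are hypothetical, and it assumes the model's text response contains the score as a standalone digit from 1 to 5, which the listing does not specify.

```python
import re
from enum import IntEnum


class ProgressScore(IntEnum):
    """Rubric levels as listed above (names are illustrative)."""
    NO_SUCCESS = 1          # No goal-relevant change
    MINIMAL_PROGRESS = 2    # Small, insufficient change
    PARTIAL_COMPLETION = 3  # Good progress, but major or multiple minor violations
    NEAR_COMPLETION = 4     # Correct intent, one minor requirement missed
    PERFECT_COMPLETION = 5  # All requirements satisfied


def parse_progress_score(response_text: str) -> ProgressScore:
    """Return the first standalone digit 1-5 found in the response.

    Assumption: the response format is free text containing the score;
    adjust the pattern if the actual output is structured differently.
    """
    match = re.search(r"(?<!\d)[1-5](?!\d)", response_text)
    if match is None:
        raise ValueError(f"no score between 1 and 5 found in: {response_text!r}")
    return ProgressScore(int(match.group()))
```

For example, `parse_progress_score("Score: 4")` yields `ProgressScore.NEAR_COMPLETION`, while a response with no digit in range raises `ValueError` instead of silently defaulting to a score.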

Use Cases

This model is particularly suited for:

Robotic Reinforcement Learning: Providing reward signals for training robotic agents.
Automated Robotic Evaluation: Objectively scoring robotic task performance without human intervention.
Robotics Research: Aiding in the development and analysis of robotic control policies.
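
For the reinforcement learning use case, the discrete 1 to 5 score usually needs converting into a scalar reward. The sketch below shows one possible mapping and is an assumption rather than a method described by the model's authors: `score_to_reward` is a hypothetical helper that either rescales the score linearly to [0, 1] or, optionally, treats only a score of 5 as success.

```python
def score_to_reward(score: int, success_only: bool = False) -> float:
    """Map a rubric score (1-5) to a scalar reward in [0, 1].

    Assumed conventions (not from the model card):
    - linear mode: 1 -> 0.0, 5 -> 1.0, evenly spaced in between
    - success_only mode: 1.0 only for a perfect completion (5), else 0.0
    """
    if not 1 <= score <= 5:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    if success_only:
        return 1.0 if score == 5 else 0.0
    return (score - 1) / 4.0


# Example: a "Partial Completion" rollout under the linear mapping
print(score_to_reward(3))  # 0.5
```

The linear mapping keeps the partial-progress signal that the rubric provides, while the binary mode suits setups that expect sparse success rewards.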
