CodeGoat24/UnifiedReward-2.0-qwen35-9b

VISION · Concurrency Cost: 1 · Model Size: 9B · Quant: FP8 · Ctx Length: 32k · Tool Calling: Supported · Published: Mar 7, 2026 · License: MIT · Architecture: Transformer · Open Weights · Cold

CodeGoat24/UnifiedReward-2.0-qwen35-9b is a 9-billion-parameter unified reward model based on Qwen/Qwen3.5-9B, developed by CodeGoat24. It assesses both multimodal understanding and multimodal generation, and supports pairwise ranking as well as pointwise scoring. It is intended for preference alignment of vision models across image and video generation and understanding tasks.


UnifiedReward-2.0-qwen35-9b Overview

UnifiedReward-2.0-qwen35-9b is a 9-billion-parameter reward model developed by CodeGoat24 and built on the Qwen/Qwen3.5-9B base model. It handles multimodal assessment in a single model and can produce either a pairwise ranking or a pointwise score. Its primary application is preference alignment of vision models, where it evaluates the outputs of both generation and understanding tasks for images and video.

Key Capabilities

  • Multimodal Assessment: Evaluates content across image generation, image understanding, video generation, and video understanding tasks.
  • Dual Scoring Methods: Supports both pairwise ranking (comparing two outputs) and pointwise scoring (assigning a score to a single output).
  • Vision Model Alignment: Designed to provide the preference signal used when aligning vision models.
  • Broad Coverage: Unlike many specialized reward models, UnifiedReward-2.0 covers all four multimodal assessment categories (image generation, image understanding, video generation, and video understanding), as detailed in its comparative analysis.
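The two scoring methods listed above differ mainly in what the caller has to extract from the model's reply: a number for pointwise scoring, a winner for pairwise ranking. The sketch below illustrates that difference with two small parsers. The reply formats it matches ("Final Score: ..." and "Image N is better") and the function names are assumptions made for illustration; they are not taken from this page, so check the model's own prompt templates before relying on them.

```python
import re
from typing import Optional

# ASSUMPTION: the reply formats matched below are hypothetical examples,
# not the documented output format of UnifiedReward-2.0-qwen35-9b.
_SCORE_RE = re.compile(r"Final Score:\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE)
_WINNER_RE = re.compile(r"Image\s*([12])\s+is better", re.IGNORECASE)


def parse_pointwise(reply: str) -> Optional[float]:
    """Return the numeric score found in a pointwise reply, or None."""
    match = _SCORE_RE.search(reply)
    return float(match.group(1)) if match else None


def parse_pairwise(reply: str) -> Optional[int]:
    """Return 1 or 2 for the preferred candidate in a pairwise reply, or None."""
    match = _WINNER_RE.search(reply)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(parse_pointwise("The image matches the caption well. Final Score: 4.5"))
    print(parse_pairwise("Image 2 is better because it follows the prompt."))
```

Returning `None` rather than raising keeps unparseable replies easy to filter out in a batch evaluation loop.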

Good For

  • Vision Model Developers: Suited to researchers and developers who need a reward signal for preference alignment and quality assessment of vision models.
  • Multimodal Content Evaluation: Useful for evaluating the quality and alignment of outputs from image and video generation models, as well as assessing the understanding capabilities of vision models.
  • Research in Reward Modeling: Provides a unified framework for multimodal reward modeling, with broader coverage than single-modality or single-method reward models. Further details are available in the associated paper and project page.
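For the preference-alignment use case above, one common way to consume pointwise scores is to turn several scored candidates for the same prompt into (chosen, rejected) pairs for a preference-optimization method such as DPO. The following is a minimal sketch of that step only. The `PreferencePair` type, the `build_preference_pairs` helper, and the `min_margin` threshold are hypothetical names introduced here, and the scores are assumed to have already been obtained from the reward model.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class PreferencePair:
    """One training example: the same prompt with a preferred and a dispreferred output."""
    prompt: str
    chosen: str
    rejected: str


def build_preference_pairs(
    prompt: str,
    scored: List[Tuple[str, float]],
    min_margin: float = 0.5,
) -> List[PreferencePair]:
    """Pair up candidates whose reward scores differ by at least min_margin.

    `scored` holds (candidate_id, score) tuples. Pairs with a smaller score
    gap are skipped, since the preference between them is likely noisy.
    """
    pairs = []
    for (a, score_a), (b, score_b) in combinations(scored, 2):
        if abs(score_a - score_b) < min_margin:
            continue
        chosen, rejected = (a, b) if score_a > score_b else (b, a)
        pairs.append(PreferencePair(prompt, chosen, rejected))
    return pairs


if __name__ == "__main__":
    candidates = [("a.png", 4.5), ("b.png", 2.0), ("c.png", 4.2)]
    for pair in build_preference_pairs("a red bicycle on a beach", candidates):
        print(pair)
```

The margin filter is a design choice rather than a requirement: a larger `min_margin` yields fewer but cleaner pairs, while `min_margin=0` keeps every pair with distinct scores.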