CodeGoat24/UnifiedReward-Edit-qwen3vl-8b

Vision | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Nov 11, 2025 | License: mit | Architecture: Transformer | Open Weights | Cold

CodeGoat24/UnifiedReward-Edit-qwen3vl-8b is an 8-billion-parameter unified reward model developed by CodeGoat24 for evaluating both Text-to-Image and Image-to-Image generation tasks. For image editing, it supports pairwise ranking, pairwise scoring, and pointwise scoring, judging each result on instruction following and overall image quality.

UnifiedReward-Edit-qwen3vl-8b: A Specialized Reward Model for Image Generation and Editing

CodeGoat24/UnifiedReward-Edit-qwen3vl-8b is an 8-billion-parameter reward model developed by CodeGoat24. It evaluates the quality of both Text-to-Image (T2I) and Image-to-Image (I2I) generation within a single model, with a strong focus on image editing tasks.

Key Capabilities

  • Image Editing Evaluation: Supports three methods for judging edited images:
    • Pairwise Rank: Determines which of two edited images is superior.
    • Pairwise Score: Assigns individual scores to each image within a pair.
    • Pointwise Score: Rates a single image on two axes: adherence to instructions and overall image quality.
  • Unified Reward System: Extends evaluation beyond editing to general T2I generation, judging visual quality and instruction following.
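
As a rough illustration of how the three editing modes relate to one another, the sketch below assumes the model's text output has already been parsed into numbers. The PointwiseScore class, the rank_from_pair_scores helper, the equal default weighting, and the tie margin are all hypothetical names and choices made for this example; they are not part of the UnifiedReward code, and the model's actual output format is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointwiseScore:
    """Hypothetical container for the two pointwise axes of one image."""

    instruction_following: float
    image_quality: float

    def combined(self, weight: float = 0.5) -> float:
        """Weighted mean of the two axes.

        The weighting scheme is an assumption for illustration only.
        """
        return (
            weight * self.instruction_following
            + (1.0 - weight) * self.image_quality
        )


def rank_from_pair_scores(
    score_a: float, score_b: float, tie_margin: float = 0.0
) -> str:
    """Derive a pairwise-rank label ("A", "B", or "tie") from two scores."""
    if abs(score_a - score_b) <= tie_margin:
        return "tie"
    return "A" if score_a > score_b else "B"
```

For example, PointwiseScore(4.0, 2.0).combined() averages the two axes with equal weight, and rank_from_pair_scores(3.5, 2.0) prefers the first image. Whether a single combined number is appropriate depends on the downstream use; keeping the two axes separate preserves more information.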

Training and Resources

The model's image editing capabilities are strengthened by training on data preprocessed from the EditScore and EditReward datasets. The image editing reward inference code is in the UnifiedReward-Edit/ directory of the UnifiedReward repository.

Good for

  • Automated Image Quality Assessment: Ideal for developers needing to automatically evaluate the output of image generation and editing models.
  • Reinforcement Learning from Human Feedback (RLHF): Can serve as a reward signal when fine-tuning generative models toward human preferences for image quality and instruction adherence.
  • Benchmarking and Research: Useful for researchers and practitioners comparing image editing algorithms or T2I models, since it gives consistent, quantitative feedback.
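
For the benchmarking use case, one simple way to summarize pairwise-rank verdicts is a per-system win rate. The sketch below is a minimal example under stated assumptions: the win_rates function, the (system_a, system_b, verdict) record layout, and the rule that a tie counts as half a win are illustrative choices, not something the model or its repository prescribes.

```python
from collections import Counter
from typing import Iterable


def win_rates(outcomes: Iterable[tuple[str, str, str]]) -> dict[str, float]:
    """Compute per-system win rates from pairwise verdicts.

    Each record is (system_a, system_b, verdict), where verdict is
    "A", "B", or "tie". A tie counts as half a win for each side
    (an assumption made for this sketch).
    """
    wins: Counter = Counter()
    games: Counter = Counter()
    for sys_a, sys_b, verdict in outcomes:
        if verdict not in ("A", "B", "tie"):
            raise ValueError(f"unexpected verdict: {verdict!r}")
        games[sys_a] += 1
        games[sys_b] += 1
        if verdict == "A":
            wins[sys_a] += 1.0
        elif verdict == "B":
            wins[sys_b] += 1.0
        else:
            wins[sys_a] += 0.5
            wins[sys_b] += 0.5
    # Systems that never won are absent from `wins`; Counter returns 0 for them.
    return {name: wins[name] / games[name] for name in games}
```

Calling win_rates on a list of verdicts collected over a shared set of edit instructions gives one number per system. Win rates depend on which opponents each system faced, so comparisons are most meaningful when every system is judged against the same opponents on the same prompts.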