CodeGoat24/UnifiedReward-Flex-qwen3vl-32b

  • Modality: Vision
  • Concurrency Cost: 2
  • Model Size: 33.4B
  • Quant: FP8
  • Ctx Length: 32k
  • Tool Calling: Supported
  • Published: Feb 2, 2026
  • License: MIT
  • Architecture: Transformer
  • Open Weights

CodeGoat24/UnifiedReward-Flex-qwen3vl-32b is a unified personalized reward model for vision generation developed by CodeGoat24. It couples reward modeling with flexible, context-adaptive reasoning to improve vision generation tasks. This release features updated model weights and enhanced training data to mitigate position bias. The model is intended for applications that need personalized reward signals in visual content creation.

Overview

CodeGoat24/UnifiedReward-Flex-qwen3vl-32b is a specialized model developed by CodeGoat24, focusing on unified personalized reward modeling for vision generation. It distinguishes itself by coupling reward mechanisms with flexible and context-adaptive reasoning, aiming to improve the quality and relevance of generated visual content.

Key Capabilities

  • Unified Personalized Reward Modeling: Designed to provide personalized feedback for vision generation tasks.
  • Flexible and Context-Adaptive Reasoning: Incorporates advanced reasoning capabilities that adapt to different contexts.
  • Bias Mitigation: Ships with updated model weights and enhanced training data to mitigate position bias.
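
One way to check a pairwise judge for position bias is to query it in both presentation orders and accept only verdicts that survive the swap. The sketch below illustrates that check under stated assumptions: the `PairwiseJudge` signature, the function names, and the two toy judges are hypothetical and are not part of this model's inference code, which may expose a different interface.

```python
from typing import Callable, Optional

# Hypothetical judge signature (an assumption, not this model's API):
# takes (prompt, first_image, second_image) and returns 1 if it prefers
# the first image or 2 if it prefers the second.
PairwiseJudge = Callable[[str, str, str], int]


def order_consistent_preference(
    judge: PairwiseJudge, prompt: str, image_a: str, image_b: str
) -> Optional[str]:
    """Query the judge in both presentation orders.

    Returns "a" or "b" when both orders agree, or None when the verdict
    flips with the ordering (a symptom of position bias).
    """
    forward = judge(prompt, image_a, image_b)   # image_a shown first
    backward = judge(prompt, image_b, image_a)  # image_b shown first
    forward_winner = "a" if forward == 1 else "b"
    backward_winner = "b" if backward == 1 else "a"
    return forward_winner if forward_winner == backward_winner else None


def always_first(prompt: str, first: str, second: str) -> int:
    """Toy judge that is fully position-biased: it always picks slot 1."""
    return 1


def prefers_longer_name(prompt: str, first: str, second: str) -> int:
    """Toy judge whose verdict depends only on content, not on position."""
    return 1 if len(first) >= len(second) else 2
```

With these toy judges, `always_first` yields `None` for any pair because its verdict flips when the images are swapped, while `prefers_longer_name` returns the same winner in both orders. In practice the `judge` argument would wrap a call to the reward model.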

Use Cases

This model is particularly well-suited for applications in:

  • Vision Generation: Enhancing the quality and personalization of generated images or visual content.
  • Personalized AI Systems: Developing systems that require adaptive and personalized feedback loops for visual tasks.
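
A common way to use a reward model for the first use case is best-of-N selection: generate several candidates for a prompt, score each one, and keep the highest-scoring result. The following is a minimal sketch under stated assumptions; `best_of_n`, the `Scorer` signature, and `toy_score` are hypothetical stand-ins, and a real scorer would call the reward model rather than compare file names.

```python
from typing import Callable, Sequence

# Hypothetical scorer signature (an assumption, not this model's API):
# takes (prompt, candidate) and returns a number, higher meaning better.
Scorer = Callable[[str, str], float]


def best_of_n(score: Scorer, prompt: str, candidates: Sequence[str]) -> str:
    """Return the candidate with the highest score for the given prompt."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    return max(candidates, key=lambda candidate: score(prompt, candidate))


def toy_score(prompt: str, candidate: str) -> float:
    """Toy stand-in: counts how many prompt words appear in the file name."""
    return float(sum(word in candidate for word in prompt.split()))


picked = best_of_n(toy_score, "red cat", ["blue_dog.png", "red_cat.png", "red_dog.png"])
```

Here `picked` is the file name that matches the most prompt words. Ties are resolved in favor of the earliest candidate, which is how Python's `max` behaves.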

For more technical details, see the associated research paper. The inference code is available on GitHub.