AIJian/PaTaRM-8B

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Mar 19, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

AIJian/PaTaRM-8B is an 8-billion-parameter language model from the PaTaRM series, based on the Qwen3-8B architecture. Developed by Ai Jian and collaborators, it implements Preference-Aware Task-Adaptive Reward Modeling, which bridges pairwise and pointwise signals. It is intended for reward modeling tasks, as described in the associated paper, arXiv:2510.24235.


PaTaRM-8B Overview

PaTaRM-8B is an 8-billion-parameter model in the PaTaRM series, built on the Qwen3-8B base architecture. It comes out of research by Ai Jian and colleagues on Preference-Aware Task-Adaptive Reward Modeling (PaTaRM). Its core idea is to combine pairwise signals (which of two responses is preferred) with pointwise signals (a score for a single response) to improve reward modeling, as described in the associated paper (arXiv:2510.24235).
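To make the pairwise/pointwise distinction concrete, here is a minimal sketch of one standard way the two kinds of signal relate: the Bradley-Terry model, which turns two pointwise scores into a pairwise preference probability. This is a generic illustration and an assumption on our part; it is not taken from the PaTaRM paper and does not describe how PaTaRM itself is trained. The function names are hypothetical.

```python
import math
from typing import Iterable, Tuple


def preference_probability(score_a: float, score_b: float) -> float:
    """Bradley-Terry probability that response A is preferred over B.

    Computed as sigmoid(score_a - score_b), branching on the sign of the
    difference so that math.exp never receives a large positive argument.
    """
    diff = score_a - score_b
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def pairwise_accuracy(pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of (chosen_score, rejected_score) pairs ranked correctly.

    A pair counts as correct only when the chosen response receives a
    strictly higher pointwise score than the rejected one.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("pairs must not be empty")
    correct = sum(1 for chosen, rejected in pairs if chosen > rejected)
    return correct / len(pairs)
```

With this framing, a model that emits pointwise scores can still be evaluated on pairwise preference data by checking whether the human-preferred response receives the higher score.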

Key Characteristics

  • Architecture: Based on the Qwen3-8B model.
  • Parameter Count: 8 billion parameters.
  • Research Focus: Bridging pairwise and pointwise signals for improved reward modeling.
  • Series: Part of the broader PaTaRM model collection, which also includes PaTaRM-14B.

Potential Use Cases

  • Advanced Reward Modeling: Intended for research and applications that need preference learning combining pairwise and pointwise reward signals.
  • Preference-Based Systems: Suited to tasks that depend on modeling user or system preferences.
  • Research & Development: A starting point for researchers exploring reinforcement learning from human feedback (RLHF) and preference optimization.
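As an illustration of the preference-based use cases above, the following sketch shows two common ways pointwise reward scores are consumed downstream: best-of-n response selection, and building a (chosen, rejected) pair for preference optimization. The `score_fn` callable is a hypothetical stand-in for whatever wrapper returns a scalar score from a reward model; the input and output format of PaTaRM-8B is not specified here, so treat this as a pattern under that assumption rather than the model's actual interface.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical interface: (prompt, response) -> scalar reward.
ScoreFn = Callable[[str, str], float]


def best_of_n(prompt: str, candidates: Sequence[str],
              score_fn: ScoreFn) -> Tuple[str, float]:
    """Return the highest-scoring candidate and its score."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    scored = [(c, score_fn(prompt, c)) for c in candidates]
    # max() keeps the first candidate when scores tie.
    return max(scored, key=lambda item: item[1])


def make_preference_pair(prompt: str, candidates: Sequence[str],
                         score_fn: ScoreFn,
                         min_margin: float = 0.0) -> Optional[Tuple[str, str]]:
    """Build a (chosen, rejected) pair from pointwise scores.

    Returns None when there are fewer than two candidates or when the
    gap between the best and worst score is not greater than min_margin.
    """
    if len(candidates) < 2:
        return None
    scored = [(c, score_fn(prompt, c)) for c in candidates]
    chosen = max(scored, key=lambda item: item[1])
    rejected = min(scored, key=lambda item: item[1])
    if chosen[1] - rejected[1] <= min_margin:
        return None
    return chosen[0], rejected[0]


def toy_score(prompt: str, response: str) -> float:
    """Toy scorer for demonstration only: longer responses score higher."""
    return float(len(response))
```

In practice `toy_score` would be replaced by a function that queries the reward model; the selection logic stays the same.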