TMLR-Group-HF/Co-rewarding-II-Qwen3-8B-Base-OpenRS
Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · License: MIT · Architecture: Transformer · Open weights
TMLR-Group-HF/Co-rewarding-II-Qwen3-8B-Base-OpenRS is an 8-billion-parameter language model released by TMLR-Group-HF, built on the Qwen3-8B-Base architecture. It was trained with the Co-rewarding-II method, a self-supervised reinforcement-learning approach for eliciting reasoning, on the OpenRS dataset of math reasoning problems. The model is intended for applications that need a base model with reinforcement-learning-based reasoning training, and it supports a 32,768-token context length.
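A minimal usage sketch with the Hugging Face `transformers` library is shown below. This assumes `transformers` and `torch` are installed and that the weights can be downloaded from the Hub; the prompt is an illustrative math question, since the model was trained on reasoning data.

```python
# Sketch: load and query the model via transformers (requires network access
# to download the weights; an 8B model typically needs a GPU or ample RAM).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "TMLR-Group-HF/Co-rewarding-II-Qwen3-8B-Base-OpenRS"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place weights on available GPU(s)/CPU
)

# Illustrative reasoning prompt (not from the model card).
prompt = "Solve step by step: if 3x + 5 = 20, what is x?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the echoed prompt.
completion = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(completion)
```

Note that this is a base model without chat-template post-training, so plain-text completion prompts like the one above are more appropriate than chat-style messages.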