LorenaYannnnn/general_reward-Qwen3-0.6B-baseline_all_tokens_w_kl-seed_2
TEXT GENERATION
Concurrency cost: 1 | Model size: 0.8B | Quant: BF16 | Ctx length: 32k | Published: Mar 21, 2026 | Architecture: Transformer | Status: Warm
LorenaYannnnn/general_reward-Qwen3-0.6B-baseline_all_tokens_w_kl-seed_2 is a 0.8-billion-parameter language model based on the Qwen3-0.6B architecture. It is a general reward model, likely trained to evaluate and score responses across a range of natural language tasks. Its primary use case is providing feedback or preference signals to other language models, supporting their alignment and performance improvement.
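As a sketch of how such preference signals are typically used: a reward model assigns each candidate response a scalar score, and the probability that one response is preferred over another is commonly modeled with the Bradley-Terry formulation (sigmoid of the score difference). The scores below are illustrative, not outputs of this specific model.

```python
import math

def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry preference model: P(chosen > rejected) = sigmoid(r_c - r_r)."""
    return 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))

# Hypothetical scalar rewards a model like this might assign to two candidate replies
p = preference_probability(2.1, 0.4)
print(round(p, 3))
```

A score gap of 1.7 yields a preference probability of roughly 0.85; the log of this probability is what a typical preference-tuning loss would maximize for the chosen response.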