LorenaYannnnn/general_reward-Qwen3-0.6B-baseline_all_tokens-seed_2
Text generation | Concurrency cost: 1 | Model size: 0.8B | Quantization: BF16 | Context length: 32k | Published: Mar 16, 2026 | Architecture: Transformer

The LorenaYannnnn/general_reward-Qwen3-0.6B-baseline_all_tokens-seed_2 model is a 0.8 billion parameter language model based on the Qwen3 architecture. It is a general-purpose reward model, likely designed to evaluate and score responses produced by other language models. Its primary application is in reinforcement learning from human feedback (RLHF) pipelines or similar reward-based optimization tasks.
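To illustrate where a reward model fits in such a pipeline, the sketch below shows best-of-n selection: several candidate responses to one prompt are scored and the highest-scoring one is kept. The `reward_score` function here is a hypothetical stand-in for a forward pass through the reward model; in a real pipeline it would tokenize the prompt/response pair and return the model's scalar reward output.

```python
def reward_score(prompt: str, response: str) -> float:
    # Hypothetical heuristic standing in for the reward model's scalar
    # output: favors responses that share vocabulary with the prompt,
    # with a small bonus for length. A real implementation would run
    # the (prompt, response) pair through the reward model instead.
    overlap = len(set(prompt.lower().split()) & set(response.lower().split()))
    return overlap + 0.01 * len(response)


def best_of_n(prompt: str, candidates: list[str]) -> str:
    # Score every candidate and return the one the reward model prefers.
    return max(candidates, key=lambda r: reward_score(prompt, r))


if __name__ == "__main__":
    prompt = "Explain what a reward model does."
    candidates = [
        "No idea.",
        "A reward model scores responses so an RL loop can prefer better ones.",
        "It is a model.",
    ]
    print(best_of_n(prompt, candidates))
```

In an RLHF loop, the same scoring step supplies the training signal: the policy model's sampled responses are ranked by the reward model, and the policy is updated to make higher-reward responses more likely.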
