LorenaYannnnn/general_reward-Qwen3-0.6B-OURS_self-seed_2
Text Generation
Concurrency Cost: 1
Model Size: 0.8B
Quant: BF16
Ctx Length: 32k
Published: Mar 15, 2026
Architecture: Transformer
Warm
LorenaYannnnn/general_reward-Qwen3-0.6B-OURS_self-seed_2 is a 0.8 billion parameter language model based on the Qwen3 architecture, with a context length of 32,768 tokens. It is a self-seeded reward model, developed to evaluate and guide other language models. Its primary use is in reinforcement learning from human feedback (RLHF) pipelines, where it can score the quality of candidate responses.
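As a rough illustration of how a reward model's scores are typically used to assess response quality, the sketch below ranks candidate responses for a prompt and keeps the best one (best-of-n selection). The names `rank_responses` and `toy_score` are hypothetical, and `toy_score` is only a word-overlap placeholder: this page does not document the model's scoring interface, so a real pipeline would swap in a call to the reward model there.

```python
from typing import Callable, List, Tuple

# Hypothetical signature for a scorer: (prompt, response) -> higher is better.
ScoreFn = Callable[[str, str], float]


def rank_responses(
    prompt: str, candidates: List[str], score_fn: ScoreFn
) -> List[Tuple[float, str]]:
    """Score every candidate and return (score, response) pairs, best first."""
    scored = [(score_fn(prompt, response), response) for response in candidates]
    # Sort on the score only, so ties keep their original order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def toy_score(prompt: str, response: str) -> float:
    """Placeholder scorer (an assumption, not the model's method):
    the fraction of distinct prompt words that also appear in the response."""
    prompt_words = set(prompt.lower().split())
    response_words = set(response.lower().split())
    if not prompt_words:
        return 0.0
    return len(prompt_words & response_words) / len(prompt_words)


if __name__ == "__main__":
    prompt = "what is the capital of france"
    candidates = ["i do not know", "paris is the capital of france"]
    for score, response in rank_responses(prompt, candidates, toy_score):
        print(f"{score:.2f}  {response}")
```

In an RLHF setup the same scores would usually feed a policy-optimization step instead of a simple sort, but the ranking view shows the basic role the reward model plays.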