Models
10,819
LorenaYannnnn · Warm · 800M · 32K
Qwen3-0.6B-g_general_reward_e_sycophancy-seed_0-sky_r_weak_syco
0 · 372 · Apr 2026

process-reward-agents · Warm · 4B · 32K
Qwen3-4B-Instruct-2507_SFT_all_docs_bs2x2_lr3e-05_20260420_140000_epoch_3
0 · 371 · Apr 2026
