z0406_rt_ordinary_RT_quirk_0_lr2e-5
Gemma-3-4B-IT-TL-SynthDolly-1A-E5
Gemma-3-4B-IT-TL-SynthDolly-1A-E8
z0406_rt_ordinary_RT_quirk_0_lr5e-5
z0406_rt_sam_RT_backdoor_0_lr3e-5_rho0.005
z0406_rt_sam_RT_backdoor_0_lr3e-5_rho0.01
z0406_rt_ordinary_RT_quirk_0_lr1e-4
z0406_rt_sam_RT_backdoor_0_lr3e-5_rho0.02
z0406_rt_sam_RT_backdoor_1_lr3e-5_rho0.01
z0406_rt_sam_RT_backdoor_1_lr3e-5_rho0.02
z0406_rt_ordinary_RT_backdoor_1_lr1e-4
z0406_rt_ordinary_RT_backdoor_1_lr2e-5
z0406_rt_ordinary_RT_backdoor_1_lr5e-5
z0406_rt_ordinary_RT_quirk_1_lr2e-5
z0406_rt_ordinary_RT_backdoor_0_lr5e-5
z0406_rt_ordinary_RT_backdoor_0_lr2e-5
z0406_rt_ordinary_RT_quirk_1_lr1e-4
z0406_rt_ordinary_RT_backdoor_0_lr1e-4
instruct_math_rl
Gemma-3-4B-IT-HI-SynthDolly-1A-E1
Gemma-3-4B-IT-DA-SynthDolly-1A-E1
Gemma-3-4B-IT-ZH-SynthDolly-1A-E1
k10-lr5e-7-ema0.01-qwen3-4b-think-essay_sensitive20pct-pos_gap20pct
Gemma-3-4B-IT-ES-SynthDolly-1A-E1
Gemma-3-4B-IT-EL-SynthDolly-1A-E1
Gemma-3-4B-IT-PT-SynthDolly-1A-E1
Gemma-3-4B-IT-TL-SynthDolly-1A-E1
Gemma-3-4B-IT-DA-SynthDolly-1A-E3
qwen3-4b-agrpo-nothink-lr3e-6
Gemma-3-4B-IT-TL-SynthDolly-1A-E3
Qwen3-4B-Base-ftjob-25058cdbbe3e-merged
Gemma-3-4B-IT-ES-SynthDolly-1A-E3
Qwen3-4B-Tulu-SFT-Dolci-Reasoning-100k
cold-start-alfworld-safety-sft-qwen-4b-1-global-step-171
llm4routing
qwen3-4b-sql
CodeRM-GRPO-4B-bs96-nrp-step110-merged
qwen3_30b_a3b_to_4b_onpolicy_5k_src20k-25k
Qwen3-4B-Instruct-2507-ftjob-e3f6e890af59
bs16-k10-lr5e-7-ema0.01-eopd0.8-qwen3-4b-think-essay_bottom20_nogap-maxsteps150
Qwen3-4B-Instruct-2507-ftjob-c6534a30ef1e
Qwen3-4B-Instruct-2507-ftjob-6ff45aa40dda