Models
20,679
xw1234ganColdTools3B32K
GRPO_KL_Qwen2.5-3B-Instruct_MATH_beta0.01_lr1e-05_mb2_ga128_n2048_seed42_HF_GEN
0 · 9 · Apr 2026

ccui46ColdTools8B32K
cookingworld_per_chunk_act_q3_tokfix_diffPrompt_higherLR_tformerPin_2500
0 · 9 · Apr 2026
