Models
7,350
rghosh8 · Cold · Tools · 2B · 32K
arc-grpo-deepseek-r1-distill-qwen-1.5b-rajat-seed-42-G-4-new_merged
0 · 6 · Apr 2026

xw1234gan · Cold · Tools · 2B · 32K
GRPO_KL_Qwen2.5-1.5B-Instruct_MATH_beta0.01_lr1e-05_mb2_ga128_n2048_seed42_HF_GEN
0 · 6 · Apr 2026

xw1234gan · Cold · Tools · 8B · 32K
Merging_Prob_Qwen2.5-7B-Instruct_MATH_lr1e-05_mb2_ga128_n2048_seed42
0 · 6 · Apr 2026

parkjo · Cold · Tools · 2B · 32K
Qwen2.5-Math-1.5B_grpo_entropy_rollout_8_ent_0.001_USE_KL_0.001_resume_20260512_222805_step580
0 · 6 · May 2026

