Models
11,002
meteorain · Warm · 4B · 32K
Qwen_Qwen3-4B-Thinking-2507_int4-g16-fp8_openr1-default-concat_2048_8_1024_256_lr0.03
0
·167
·May 2026

jackf857 · Warm · 8B · 8K
llama-3-8b-base-new-dpo-ultrafeedback-4xh200-batch-128-q_t-0.4-s_star-0.5
0
·166
·Apr 2026

parkjo · Warm · 2B · 32K
Qwen2.5-Math-1.5B_grpo_entropy_rollout_8_ent_0.001_USE_KL_0.001_resume_20260512_222805_step580
0
·165
·May 2026

shengjia-toronto · Warm · 2B · 32K
sac-gspo-cl3e3-drgrpo-r1distill-qwen1.5b-step420-aime24-34_3-temp1
0
·165
·May 2026

