Models
10,988
xw1234gan · Warm · 2B · 32K
GRPO_KL_Qwen2.5-1.5B-Instruct_MMLU_beta0.01_lr1e-05_mb2_ga128_n2048_seed42_HF_GEN
0
·44
·Apr 2026

ligeng-dev · Warm · 8B · 32K
tw-data-train_final_replaced_from_classified-fix-format-8node-resume
0
·43
·Apr 2026
