Models 6,720
W-61 · Warm · 8B · 32K
qwen3-8b-base-new-dpo-ultrafeedback-4xh200-batch-128-q_t-0.45-s_star-0.45-20260430-143919
0 · 182 · Apr 2026

W-61 · Warm · 8B · 32K
qwen3-8b-base-new-dpo-ultrafeedback-4xh200-batch-128-q_t-0.43-s_star-0.3-20260430-192039
0 · 182 · Apr 2026

cosmos1030 · Warm · 800M · 32K
c1899de289a04d12100db370d81485cdf75e47ca-elsa-hybrid-kd-s50pct-lr5e-5-lmda5e-3
0 · 182 · Apr 2026

parkjo · Warm · 8B · 32K
Llama-3.1-8B-Instruct_grpo_ppl_adv_rollout_8_kl_0.001_20260516_140637_step290
0 · 182 · May 2026

jackf857 · Warm · 8B · 32K
qwen3-8b-base-epsilon-dpo-hh-harmless-4xh200-batch-64-20260424-040415
0 · 181 · Apr 2026

W-61 · Warm · 8B · 8K
llama-3-8b-base-new-dpo-hh-helpful-4xh200-batch-64-q_t-0.45-s_star-0.4-eta-0.3
0 · 181 · Apr 2026

W-61 · Warm · 8B · 8K
llama-3-8b-base-new-dpo-ultrafeedback-4xh200-batch-128-q_t-0.45-s_star-0.35-20260428-045924
0 · 181 · Apr 2026

doupari · Warm · 8B · 32K
llama3.1_8b_sft-llopa-k24-no_system-nemotron-math-high.math.q60000-llopa-k24-no_system
0 · 181 · Apr 2026
