general-kd-Qwen2.5-0.5B-Instruct-ber-5000-4000
w6g927rr
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-3500
Main_fixed_MATH_7B_step_5
Main_fixed_MATH_7B_step_10
Main_fixed_MATH_7B_step_9
Qwen3-1.7B-tldr-bsz128-ts500-ranking1.528-skywork8b-seed42-lr1e-6-warmup10-checkpoint500
acquisition_llama-3_1-8b_bins_medmcqa_format
Qwen3-1.7B-Base
Qwen3-4B-magr-0.01
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-5000
diallm-llama-dpo-all
Main_fixed_MATH_7B_step_8
nemotron-terminal-scientific_computing__Qwen3-8B
diallm-qwen-dpo-aus
Main_fixed_MATH_7B_step_6
qwen3-st2
Main_fixed_MATH_7B_step_3
NuminaMath_Main_fixed_SFTanchor_1_5B_step_1
QwenRolina3-06B-base-LR1e5-b32g2gc8-AR-order-batch
llama-1b-cov-matched-l2-lam100
dpsk_v3_2_cc_plus_t2
Qwen3-0.6B-Full-Finetuning-No-Thinking
12h5ydak
merge_v10_27_73_7
Qwen2.5-0.5B-Instruct
acquisition_qwen3bins_medmcqa_confidence
QwenRolina3-1.7B-base-LR1e5-b32g2gc8-AR-Orig-IRM
dhrubs-Qwen2.5-14B-Instruct-private
Main_fixed_MATH_7B_step_2
merge_v10_27_73_3
gemma-2b-it-wolf-numbers-ft
wizl_base_7b-fsv
Qwen3Fangwusha14B
DPO_hh-seed2
Qwen2.5-7B-Instruct_bad-medical-advice
hanoi-router-qwen3-17b
deltat1
affine_hotkey11_5E2HEWBbHU73PkMU5saE7zRiTjW2CmxRMqWRLEn9Wrrxvk5f
hazardworld_per_chunk_act_q3_tokfix_diffPrompt_2000
Qwen3-1.7B-Finetuned-LiYunLong
DPO_hh-seed1