Qwen2.5-7B-Instruct_Long_CoT
FuseO1-QwQ-SkyT1-Flash-32B
Qwen2.5-7B-Instruct-ko-lora-koalpaca-namuwiki-2epochs
Qwen2.5-7B-Instruct-Qwen2.5-Math-7B-Merged-task_arithmetic-26
Qwen2.5-7B-Instruct-Qwen2.5-Coder-7B-Merged-ties-29
qwen25-coder-triton
DS-Noisy_DS-Clean_DS-OSS_QWQ-OSS_QWQ-Clean_QWQ-Noisy_Con_Qwen2.5-7B-Instruct_sft
Qwen2.5-3B-orz
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-aquatic_pensive_eagle
Qwen2.5-7B-Instruct-userfeedback-iter2
Qwen2.5-1.5B-Open-R1-Distill
Qwen2.5-14B-Valor
RRM-32B
FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview
reranker3b-sft
Nix-1
Qwen2.5-3B-Instruct_unsloth_w_new_merged
Clinical-R1-3B-Cold-Start
Qwen2.5-3B-Instruct_new_alpaca_003
Qwen2.5-3B-Instruct_old_sft_alpaca_005
qwen-2.5-3b-r1-countdown
unified-model-stage1-action-tokens-v2
qwen3b_v3
qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_2
PAD_Student_and_teacher
qwen25-3b-peacetalk-magic-v2-merged
Main_MATH_3B_step_3
Main_MATH_3B_step_4
Main_MATH_3B_step_7
ielts-writing-scorer-merged
Qwen-2.5-7B-DTF
qwen_OHprompts_GPT4oresponses_8k
Qwen-2.5-7B-Simple-RL
dpo_VD-DS-Clean-8k_VD-QWQ-Clean-8k_Qwen2.5-7B-Instruct_full_sft_1e-5_full
qwen_OHprompts_GPT4oresponses_4k
Qwen7B-Roll-L28E3
Qwen2.5-14B-Kebab-v0
QwentileLambda2.5-32B-Instruct
lr1e-05-global_step_140
Qwen2.5-1.5B-Reverse-SFT
0.5B-policy-iteration_1
Qwen2.5-Coder-Instruct-14B-text-to-1csql