a1_science_stackexchange_physics_1k
openthoughts3_300k_ckpts
Qwen2.5-7B-sft-ultrachat
Qwen2.5-7B-Baseline-SFT
0620-sft_vanilla_all_principles_wc_multi_attrs-qwen2.5_7b_instruct-2_epochs
Llama-3.1-8B-sft-SPIN-gpt4o-ORPO
0615-sft_info_wc_multi_attrs-qwen3_8b_base-7_epochs
Llama-3.1-8B-sft-SPIN-Llama-3.1-70B-Instruct-KTO
Synthesizer-8B-math
Llama-3.1-8B-sft-ultrachat-SPIN-gpt4o
Bio-Medical-Llama-3-8B-CoT-012025
keval-2-9b
Llama-3.1-8B-sft-gen-dpo-10k-beta0.7-lr5e-7
0619-sft_vanilla_no_sexism_wc_multi_attrs-qwen2.5_7b_instruct-2_epochs
Llama-3.1-8B-sft-peers-pool-IPO
affine-01-5DSHBVivsm4fbhRULpRL4897uncVU1wGj2f2ETEDGdrDU9JS
affine-4-5CtDhg8C3LHkLSsfzE5hMBoiBZG2Bvn9M5JFssvmdDeRuXSs
affine-test-5GEc6UzXjDCDxcE7cpB8yxW3g83gSNFVQYZJZRYMQXdkBU6Y
chess-v6-rs-v3
sft-vpt_distill2-step111
affine-k-5CDUswY2ZK2nXnkaWhBAWD47CQE3KvMm6AyKhJ1Txm5R5tdi
Affine-top4_v2-5F2JV4RvwPyAPe9axBri86v18DY35gdKpVQQg7K1bNCCDbDY
appworld-agent-8B-distillation-sft-no-think-new-agent-multilock-dev-0120-global-step-400
rrr
Llama-3.1-8B-Instruct_SFT_Math-220kv00.35
Llama-3.1-8B-Instruct_SFT_Math-220kv00.32
Llama-3.1-8B-Instruct_SFT_Math-220kv00.24
1412_rl_rag_open_judge_citation_1237__1__1768961599_step1000
Affine-af4
gemma9b-cot-tr-merged
Meta-Llama-3.1-8B-Instruct_old_sft_alpaca_005
affine-06-5ECmgtFtDFmEronjQ6wpcYjmNsdDukJyavrSUou5CQrnT7te
qwen3-8b-bfcl-sft-merged
kario-test-v0-full
rl-scaling-rft-qwen-2.5-7b-instruct-grpo-long-reasoning
Qwen3-8B-ot_step80
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-50-7.5e-6
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-10
qwen2.5-math-7b_grpo_entropy_adv
Qwen3-8B-cc26-narr-aug-ft
llama-2-7b-ssc
ws-wm-0208-step-120