qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_5__global_step_1184
qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_5__global_step_296
appworld_distillation_sft_v2-SFT-Qwen3-8B
Qwen-7B_TAC_RLOO
affine-Duke250-5EJ4hgspKYPAzu2VATWx3yNGxnssW72Xis4CJhPq4h2EvvyH
Laser-DE-L4096-1.5B
pitinf_8b_identity-merged
Finfluencer-8B
affine_h1_5FADnMAcCVQvKH9wM8odQY3E2zxS6TJ6ad1a3mna9ws6adrG
Laser-D-L2048-1.5B
llama-1b-sft-tldr
OpenR1-Distill-Qwen3-1.7B-Math
math_merge_linear_1.5B
affine-5FCJpxFbwsLbujy89cYAHzEUHBPem5xvPHHa6VHvX5xRHyZ6
Qwen3-14B-am
Qwen3-32B-am
Affine-1-5FNbAdWA9umLzLTpFwfsfybcEfS66jdcWoJTVhsJL6SXxofZ
qwen3_1.7b_rush_hour_multi_move_final
InjecAgent-Llama-3.1-8B-Instruct-optim-5
InjecAgent-Llama-3.1-8B-Instruct-optim-10
olympiad-curated-qwen3-4b-thinking-distill-30b
64_v1_scalable
qwen3_1.7b_new_sudoku_one_action_A_sft_lr_5e_6__step_1686
agentic-sudoku-NoStateTrans_qwen2.5-3B-5e-6_gt-SFT_ans1-24k
qwen3-1.7b-base-adam-3e-6-bs128-kl0.0-global_step_200
GELI
PA-RAG_Llama-2-7b-chat-hf
llama_2_gsm8k_cot_simplest
llama2_openo1_safe_o1_4o_reflect_4000_1000_full
Humpback_Myx
llama_2_alpaca_llama_2
llama_2_unsafe_llama_2
Qwen2.5-0.5B-Instruct-Thai-SFT
RRM-gemma2-2b
LlamaSlerp1-8B
DeepScaleR-1.5B-Preview-thinkprune-4k
tinyllama-itinerary-final
Qwen2.5-0.5B-Reverse-SFT
aera-4b
ConfTuner-LLaMA
north_llama32_3b_enhancedNCC_instruct_v1_long_lr2e6_2048_160000
91