qwen3-0.6b-warmup
model_sft_lora
qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_0
qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_5
OpenRS-GRPO-S
Qwen3-0-6B-NagaGov-FAQ
train_sst2_42_1773765558
Qwen3-1.7B-Base_dsum_3_6_1p0_0p0_1p0_grpo_sapo_42_rule
asgn2-merged_full
asgn2-harmful-merged
qwen2.5-1.5b-gsm8k-train-step1500
qwen2.5-1.5b-gsm8k-train-step3000
qwen2.5-1.5b-gsm8k-train-step4500
Llama-3.2-3B-Instruct-C_M_T-AUX_CT2_CE_EE
Llama-3.2-3B-Instruct-MPO-SKD-V7
distill-sft-qwen3-4b-full
supply-chain-grpo-Qwen3-1.7B
Qwen3-0.6B-Base-CPT-Math
top_17_ranking_stackexchange
simpo-evol_tt_5s
simpo-oh_teknium_scaling_down_ratiocontrolled_0.9
llama3-1_8b_multiple_samples_shortest_numina_aime
instruction_filtering_scale_up_code_base_gemini_length_8K
CA_paper_tiny_CALL_c511_r1_O1_f1_LT
TinyLlama-1.1B-Chat-v1.0_finetuned_3_optimized1_task_grouping_off_FT
TinyLlama-1.1B-Chat-v1.0_finetuned_3_optimized1_oversampling_FT
Qwen2-0.5B-SFT-full
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-docile_playful_octopus
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-mimic_extinct_llama
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-bristly_freckled_weasel
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-yawning_jumping_pheasant
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-finicky_nimble_opossum
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-huge_dappled_albatross
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-loud_playful_gerbil
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-diving_giant_alpaca
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-reptilian_leggy_horse
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-aquatic_dense_chicken
Llama-3.2-1B-Instruct_MED_NLI
Llama-3.2-1B-unsloth-bnb-4bit-dpo
Grogros-dm-llama3.2-1BI-OMI-Al4-OWT-TV-OpenMathInstruct
Grogros-dm-llama3.2-1BI-WOHealth-Al4-NH-WO-TV-Al4