sft_llama1_alma_lr_1e-5_cosine_bsz_128_ckpt_5_of_5
qwen3_1.7b_sudoku_one_action_easy_21_30_epoch1
qwen3_1.7b_sudoku_one_action_easy_21_30_epoch2
qwen3_1.7b_sudoku_one_action_easy_21_30_epoch3
ds1p5b_skywork_math_hard-global_step_300
qwen3_1.7b_rush_hour_multi_move_final_short_4_9_epoch2
qwen3_1.7b_rush_hour_multi_move_final_short_4_9
codecontest_qwen2.5_72b_grpo
Qwen3-0.6B-Gensyn-Swarm-thriving_miniature_chinchilla
Qwen3-0.6B-Gensyn-Swarm-bold_feathered_antelope
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-wary_leggy_rabbit
Qwen3-0.6B-abliterated
qwen2.5-1.5b-pro
sft_qwen15_code200_lr_1e-5_cosine_bsz_128_ckpt_1_of_5
sft_qwen15_code200_lr_1e-5_cosine_bsz_128_ckpt_3_of_5
sft_qwen15_code200_lr_1e-5_cosine_bsz_128_ckpt_4_of_5
sft_qwen15_code200_lr_1e-5_cosine_bsz_128_ckpt_5_of_5
sft_qwen15_code200_lr_1e-5_cosine_bsz_64_ckpt_1_of_5
sft_qwen15_code200_lr_1e-5_cosine_bsz_64_ckpt_3_of_5
sft_qwen15_code200_lr_1e-5_cosine_bsz_64_ckpt_4_of_5
sft_qwen15_code200_lr_1e-5_cosine_bsz_64_ckpt_5_of_5
sft_qwen15_code200_lr_5e-6_constant_bsz_128_ckpt_1_of_5
sft_qwen15_code200_lr_5e-6_constant_bsz_128_ckpt_2_of_5
sft_qwen15_code200_lr_5e-6_constant_bsz_128_ckpt_3_of_5
sft_qwen15_code200_lr_5e-6_constant_bsz_128_ckpt_5_of_5
sft_qwen15_code200_lr_5e-6_constant_bsz_64_ckpt_1_of_5
sft_qwen15_code200_lr_5e-6_constant_bsz_64_ckpt_2_of_5
sft_qwen15_code200_lr_5e-6_constant_bsz_64_ckpt_3_of_5
sft_qwen15_code200_lr_5e-6_constant_bsz_64_ckpt_4_of_5
sft_qwen15_code200_lr_5e-6_constant_bsz_64_ckpt_5_of_5
qwen2.5-3b-icd10-top50-multi-task
Qwen3-0.6B-Tiny-Hanabi-XML-SFT
Qwen3-1.7B-Tiny-Hanabi-XML-SFT
SFT-Warmup-1.7B-BCB
affine-finaltest-1
Qwen3-4B-chess-grpo-base-5000
Qwen3-4B-Instruct-2507-Tiny-Hanabi-SFT
Qwen3-4B-Instruct-2507-SFT-Pubmed
Qwen3-1.7B-CCC-merged-cp3-LR1e-4
sft-count_loss-Qwen3-0.6B-mle0.5-ul0.5-tox1.0-e4
vt-qwen-3b-GRPO-merged-16bit
openthoughts3_100k_qwen25_1b_bsz1024_lr2e5_epochs5