qwen3-4b-base-svd-muon-adam-1e-6-adamlr-1e-6-bs128-kl0.0-global_step_60
qwen3-4b-base-svd-muon-adam-1e-6-adamlr-1e-6-bs128-kl0.0-global_step_40
qwen3-14b-rl
qwen3-8b-sft-datamix-350
gabx3
Qwen3-0.6B-Gensyn-Swarm-powerful_whiskered_barracuda
qwen3-4b_grpo_skywork_math-global_step_100
s1K-1.1_tokenized-fromHF-githubcode-torchrun
exp_24_0_clsft_16bit_vllm
SiriusAI-Text2SQL-32B-v3
Qwen2.5-7B-Instruct_old_sft_alpaca_007
Meta-Llama-3.1-8B-Instruct_old_sft_alpaca_007
Llama-3.2-3B-Instruct_old_sft_alpaca_001
Meta-Llama-3.1-8B-Instruct_old_sft_alpaca_001
OpenThinker-7B-summary-type3-e1-10000
Llama-3.2-3B-Instruct_new_alpaca_005
TwinLlama-3.1-8B-DPO
qwen2-5-7b-full-pretrain-control-tweet-1m-en-reproduce-bs8
tbench-qwen-sft-multitask-clean-v10
qwen3_1.7b_rush_hour_one_move_4_9_epoch2
rlvr_llama1_warmstart_bleu_alma_rbz_256_ckpt_2_of_10
rlvr_llama1_warmstart_bleu_alma_rbz_256_ckpt_7_of_10
sft_llama1_alma_lr_1e-5_cosine_bsz_128_ckpt_5_of_5
Qwen2.5-7B-Instruct_new_alpaca_009
tbench-qwen-sft-multitask-nat-v11
Affine-5GRCUvyeR5sHNFjWGXbW8A5vbJWtBUr8qa5mK8fDd6uspNm9
qwen3_1.7b_sudoku_one_action_easy_21_30_epoch1
qwen3_1.7b_sudoku_one_action_easy_21_30_epoch2
qwen3_1.7b_sudoku_one_action_easy_21_30_epoch3
ds1p5b_skywork_math_hard-global_step_300
qwen3_1.7b_rush_hour_multi_move_final_short_4_9_epoch2
qwen3_1.7b_rush_hour_multi_move_final_short_4_9
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-40
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-50
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-70
VLM_stage_2_iter_0004000
grpo_rmsprop_llama3p1_8b_3k_seqlen_1e-7
codecontest_qwen2.5_72b_grpo
MATH-Qwen2.5-math-7B-ReMax-L2O-NoBaseline
Qwen2.5-7B-ja-struct-tooled-base
Saudi-Judge-Merged-16bit
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-downy_dense_starfish