qwen3_1.7b_easy_rl_reinforce_alpha_1
qwen3_4b_medium_rl_final
qwen3_4b_easy_rl_new
affine-0KB
Affine-5EhWps4siKMSQayJ56Qmid1icCudF64H8PPn94CLAq1snkQw
qwen3_1.7b_easy_rl_fixed_gamma_1
SynGen-14B
MetalGPT-1-heretic
Qwen3-4B-Thinking-2507-MPOA
random-v3
appworld_distillation_sft_v2-SFT-Qwen3-4B-Instruct-2507
turn-detection-cocalai-vllm
qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_5__global_step_296
qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_6__global_step_888
Kimina-Prover-RL-0.6B
Anonyopus_Kaou11
rl-4b-arc-abstractions-judge-norm-nothink-deltarerun-step210-0116
qwen3_1.7b_sudoku_one_action_easy_11_20
Qwen3-0.6B-Sushi-Coder
Basically-Human-4B
aera-4b
SLM-SQL-0.6B
GT-Qwen3-4B-Base-DAPO14k
Qwen3-0.6B-Gensyn-Swarm-peckish_stinging_macaque
Affine-S11
run1015-local-reasoning-obo-0_5-discrete-max32-step49
Eva-4B-mlx-fp16
chess-special-85100
Karaoke-Lyrics-Qwen3-0.6B
qwen3-1.7b-base-svd-muon-adam-1e-6-bs128-kl0.0-global_step_180
qwen3_1.7b_sudoku_multi_action_easy_11_20_epoch3
SFT-Warmup-1.7B
Qwen3-0.6B-abliterated
opensec-gdpo-4b
Qwen3-4B-Instruct-2507-GRPO-merged
Qwen3-4B-Instruct-2507-SFT-Pubmed
chessllm_4b_fp16
Qwen3-14B-DeepSeek-v3.2-Speciale-Distill
Qwen3-32B-Kimi-K2-Thinking-Distill
Qwen3-8B-ARPO-DeepSearch
Qwen3-4B-Base-SFT-20260120102752
dpo-qwen-cot-merged