Affine-ana2-3
qwen3nothink_groupsss_sft_3_newlf
affine-forward00
Affine-251225-29258
affine-test-04
affine-might-9999
Affine-ana8-3
bartleby-qwen3-0.6b
affine-1
open-thoughts-qwen3-4b-sft
Affine-1231588-jump
qwen3_1.7b_easy_rl_ours_adv_fixed_gamma_1_98_geo_ms_token_tis
Qwen3-4B-Instruct-DSGym-SFT-2K
qwen3_1.7b_easy_rl_ours_adv_fixed_no_norm
qwen3_1.7b_new_standard_B_sft_overfit_lr_5e_6__global_step_396
qwen3_1.7b_new_standard_B_sft_overfit_lr_5e_6__global_step_792
qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_5__global_step_1480
qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_5__global_step_888
qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_5__global_step_592
qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_6__global_step_1184
qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_6__global_step_296
dyck-test
affine-1-5ETyoog2ttXGSu5UhxhrLtjdL1BSbo2SeELdFAp1YBimQuq9
qwen3_1.7b_new_sudoku_one_action_B_sft_lr_5e_6__step_2216
online_acemath_rl_4b_inst_hard_16k_self_refine_step_80
magnum-qwen3-4b
CORE-Qwen3-1.7B-MATH
qwen3_1.7b_sudoku_multi_action_easy_21_30_epoch2
qwen3_1.7b_sudoku_multi_action_easy_21_30_epoch1
qwen3_1.7b_new_sudoku_one_action_A_sft_lr_5e_6__step_2248
qwen3_1.7b_new_sudoku_one_action_B_sft_lr_5e_6__step_4432
Qwen3-0.6B-Reverse-Text-SFT
affine-comp-04
Affine-at01-12-31-01
Affine-at02-12-31-02
final-d2-1.7b
qwen3_1.7b_rush_hour_multi_move_final_new
Qwen3-0.6B-Reverse-Text-RL
qwen3_1.7b_sudoku_multi_action_easy_11_20_epoch2
Qwen3-4B-Instruct-2507-OPD-wothink-800
qwen3_1.7b_sudoku_multi_action_easy_11_20_epoch1
SkeptiSTEM-4B-v2-R123-fully-merged-16bit