Qwen3-4B-rft-alfworld-e1
Affine-5HWFHBJk9TU4FEnuyDJoVEUHH3PyorgXkMx3jRtMeUcPwWPA
Affine-2508-2412
qwen3_1.7b_easy_rl_fixed_gamma_1
ShweYon-Qwen2.5-Burmese-1.5B-v1.2
Sally-4B-Thinking
q2.5_7b_aime_per_chunk_act_untrained_1500
chess-qwen-0.5b-v1
affine-e
64b_RL
32b_RL
qwen3_1.7b_easy_rl_ours_adv_fixed_gamma_1_98_mask_only
LongAttn
M4
care-chinese-qwen2.5-7b
affine-second
dhamma-model
rc_qwen3_4b_thinking_2507_proof-20260112-064952
affine-lucky-miner
qwen3_1.7b_easy_rl_ours_adv_fixed_geo_ms_token_tis
qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_6__global_step_592
Anonyopus_Kaou11
Mlem-0.6B-RL
Affine-5Dc4pnGJtH93eRjpuZoF1KnvxvkEFQV5LZiuP1RJjfMinxt4
finemath-ablation-owm
Muyan-TTS-SFT
open-dcoder-ablation-0.9
qwen3_1.7b_new_sudoku_one_action_B_sft_lr_5e_6__step_3324
DeepSeek-R1-Distill-Qwen-1.5B-DAPO-G8
qwen3_1.7b_new_sudoku_one_action_C_sft_lr_5e_6__step_5004
Affine-GTRbeatEVERYTHING
math_acc_4B
ShweYon-Qwen2.5-Burmese-0.5B-it1.0
Affine-S11
Llama32-1b-Instruct-hh-sft-30
math_len_1.5B
llama32-1b-dynamic-dpo-hh-rollout
Anonymous57_merged_plus_plus_Kaou3
agentic-futoshiki-NoStateTrans_qwen2.5-3B-5e-6_gt-SFT_20k
affine-v4-5E1iEE2bk5ru9HQPe6mAySNsJUQhuTMFiiFBRPsg5dCd1kvk
Qwen2.5-7B-orz
pdalma_ctx4_dm1_ce01_pr0_ptll32-1b_s2_ckpt_9_of_10_it311