gpt-oss-120B-stack-overflow-32ep-131k-summtrc-fixthink1
qwen3_0-6B_adversarial_2
gemma-3-4b-it
qwen3_0-6B_adversarial_final
Llama_SFT_65behaviors_452steps_lr5e-6_epoch1
dec13_32b_300_160_20_155_185_285
qwen3_1.7b_easy_rl_reinforce_alpha_0
glm-4_6-nemo-prism
Qwen3-8B-Base-scaled
qwen3_1.7b_easy_rl_final_step120
qwen3_4b_sft_new
qwen3_1.7b_easy_rl_gspo
Hypa_Llama3.2-8b-SFT-2025-12-10-16bit
qwen3-warmup-sft
qwen3_4b_base_sft_final
DUSK-target-woD1-llama3.1-8b-instruct
agentic-sokoban-Markov_qwen2.5-3B-it-5e-6_gt-SFT_6k
htktai2025-merged-model-v6
MultiTurn-Qwen3-8B-SFT
SkeptiSTEM-4B-v2-stageR1-merged-16bit
Affine-Miracle
Affine-S5
affine-077
qwen3-1.7B-GRPO-MATH
Affine-ana2-3
qwen3nothink_groupsss_sft_3_newlf
affine-forward00
Affine-251225-29258
affine-test-04
affine-might-9999
Affine-ana8-3
PRM-llama3.2-3b-alpacafarm-sft
bartleby-qwen3-0.6b
llama3b-midtrain-open-thoughts114k_math-bs4-epoch1.0-ctx8192-ga1-lr1e-05-wr0.1-n4
affine-1
open-thoughts-qwen3-4b-sft
Affine-1231588-jump
ToolRL-Qwen2.5-1.5B
zerp2
qwen3_1.7b_easy_rl_ours_adv_fixed_gamma_1_98_geo_ms_token_tis
full_sft_5
16b_SFT