qwen3_0-6B_adversarial_3
qwen3_0-6B_adversarial_5
qwen3_0-6B_adversarial_7
parti_10_full
parti_11_full
parti_13_full
parti_18_full
parti_19_full
parti_20_full
parti_27_full
parti_28_full
parti_29_full
minimax-m2-stack-overflow-32ep-131k-summtrc
nl2bash-swesmith-stack-bugsseq
qwen3_4b_easy_rl_final
qwen3_1.7b_sft_one_act
qwen3_1.7b_easy_rl_reinforce_alpha_0.5
qwen3_4b_sft_one_act
affine-test-3
glm46-defects4j-32ep-131k
glm46-qasper-maxeps-131k
qwen3_4b_medium_rl_final
qwen3_1.7b_easy_rl_final_step120
qwen3_4b_sft_new
Affine-20251215-2745
dpo-llama3.2-sapo-200
qwen3_1.7b_easy_rl_gspo
qwen3_4b_easy_rl_new
Qwen2.5-7B-TTT
Mira-v1.20-27B-dpo
Qwen3_4B-GRPO-Math
qwen3-warmup-sft
swesmith-nl2bash-stack-bugsseq
qwen3_4b_base_easy_rl_final
DUSK-target-woD1-llama3.1-8b-instruct
MMR-Sigmoid-DAPO
Magidonia-24B-v4.3-creative-ORPO-V2
htktai2025-merged-model-v6
Llama-3.1-8B-Think-Zero-GRPO
q2.5_7b_aime_per_chunk_act_untrained_500
hr_sdf_whitespace_extra_Llama-3.1-70B-Instruct_3_epochs_v1_merged
open-thoughts-4-code-qwen3-32b-annotated-gbs256-4node