Qwen3-1.7B-Base-Dapo-V1-S60
aigise-gemini-Qwen3-32B-lr1.0e-6-ga-2-sft
Affine-pipi_v1
verl_grpo_numina_qwen3_8b_sgdLR1e-1_beta0_bs256_in1024_out1024
gpt-oss-120B-stack-overflow-32ep-131k-summtrc-fixthink1
qwen3_32B_sft_IV_e1_unsloth_baseline_merged_16bit
qwen3_0-6B_adversarial_2
Qwen3-1.7B-grpo-1765505298
qwen3_0-6B_adversarial_final
kimi-k2t-freelancer-32ep-32k
dec13_32b_300_160_20_155_185_285
qwen3_1.7b_easy_rl_reinforce_alpha_0
qwen3_4b_sft_one_act
affine-test-3
glm-4_6-nemo-prism
qwen3_1.7b_easy_rl_gspo
SkeptiSTEM-4B-stageR1-merged-16bit
glm-4_6-freelancer-32ep-131k-torch
2010_rl_rag_NAR8_testing64_gpt5_sft_31605_no_cite__1__1765674535_checkpoints_step_3450
Affine-UUFipPtHQ3Ykv8GyFx
MultiTurn-Qwen3-8B-SFT
open-thoughts-4-code-qwen3-32b-annotated-gbs256-4node
SkeptiSTEM-4B-v2-stageR1-merged-16bit
ppo_sgd_qwen3_1.7b_1e-2_critic_adamW
Affine-S5
qwen3-1.7B-GRPO-MATH
Affine-ana2-3
affine-he-18
Affine-color7
qwen3nothink_groupsss_sft_3_newlf
grpo_adam_qwen3-8b_3k_seqlen
grpo_sgd_qwen3-8b_3k_seqlen
Affine-251225-29258
affine-test-04
affine-might-9999
Affine-ana8-3
affine-001
affine-004
OpenGemini-Flash
qwen3_1.7b_sft_final_easy_reinforce_ours_adv_fixed_gamma_0.9
qwen-recipe-merged