glm46-code-feedback-maxeps-131k
Qwen3-0.6B-Hanabi-SFT
s1-thinking-distill-deepseek-cot
Qwen3_4B-GRPO-Math
1ab32d9d-91a9-45d2-a322-e47698ddf2d2
affine-m-1
qwen3_4b_base_easy_rl_final
open-thoughts-4-code-qwen3-32b-annotated-7k_qwen3-8B_8k
open-thoughts-4-code-qwen3-32b-annotated-32k_qwen3-8B_32k
Affine-v7
Qwen3-4B-Inst-CoTsft
qwen3_4b_easy_rl_our_adv_final
Affine-20251223-3325-765
affine-legacy
qwen3_1.7b_easy_rl_ours_adv_fixed_sequence_epoch_3
affine-comp-02
affine-game-02
affine-golden-09
Affine-S6
affine-code-sharp
qwen3_1.7b_easy_rl_ours_adv_fixed_gamma_995_98_ori_norm
affine-mighty-eagle-999999
goof-10-test
Qwen3-0.6B-Gensyn-Swarm-roaring_sneaky_aardvark
Anonymous_hanabi_57
affine-v124
olympiad-curated-qwen3-4b-thinking-generator-critique
qwen3_1.7b_easy_rl_ours_adv_fixed_gamma_1_98_gem_ms_seq_is
affine-R15
affine-ana1-11
qwen3-4b-elicit-pos-ckpt72
Affine-second
Qwen3-1.7B-DPO-hh-rlhf
affine-ana6-6
short_paper_qwen_qwen3-instruct-4b_train_sft_train_para
Qwen3-0.6B-Gensyn-Swarm-soaring_curious_butterfly
Qwen3-0.6B-Gensyn-Swarm-pudgy_tropical_snail
qwen3_1.7b_easy_rl_final_group_norm
struct-v8
qwen3_1.7b_easy_rl_ours_adv_fixed_geo_ms_seq_is
qwen3_1.7b_new_standard_A_sft_overfit_lr_5e_6__global_step_192
Qwen3-0.6B-Gensyn-Swarm-reclusive_small_condor