gemma-4-26B-A4B-it-arli-v2
Jnotworkingv17t
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-stocky_nasty_pheasant
gemma-3-1b-it-FlashHead
Emory-CS557-AI-Final-Test
GT-Qwen3-1.7B-Base-MATH
828e3b1d
naz2
raccoon
delethink-96k-ckpt150
multiple_models_qwen3_4B_step260
qwen3-4b-thinking-rare-ckpt-109
qwen3_4b_easy_rl_final
qwen3_1.7b_easy_rl_final
qwen3_1.7b_sft_one_act
qwen3_4b_medium_rl_final
expert_len_MRL4096_ROLLOUT4_LR5e-7_step30
Affine-v1
Affine-5HWFHBJk9TU4FEnuyDJoVEUHH3PyorgXkMx3jRtMeUcPwWPA
Qwen_Qwen2.5-1.5B-Instruct-GRPO-vanilla_G_4-checkpoint-510
Affine-CR7
merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.9_linear
qwen-3B-stego-2-codes
qwen-3B-stego-no-codes
qwen3_1.7b_easy_rl_ours_adv_fixed_geo_ms_seq_is_epoch3
hr_hand_crafted_Llama-3.3-70B_medium_15_epochs_merged_v4
qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_6__global_step_888
Qwen3-8B-Gemini-3-Pro-Preview-Distill
phi3_equipment-tuned-qlora
qwen3_1.7b_new_sudoku_one_action_new_sft_lr_5e_6
Aletheia-12B
MUA-RL-32B
MUA-RL-14B
qwen3_1.7b_sudoku_multi_action_sft_final
run0118-local-reasoning-obo-0_5-baseline-max32-step49
qwen3_1.7b_sudoku_one_action_easy_11_20
qwen3_1.7b_new_sudoku_one_action_A_sft_lr_5e_6__step_1124
qwen3_1.7b_new_sudoku_one_action_A_sft_lr_5e_6__step_1686
meta-wiki-expert
Qwen3-8B_exp_tas_temp_0.5_traces_save-strategy_steps
AB2
GRPO-Think-14B-16k