phi35-sap-ax-merged
gemma-2-9b-it-sae-scoped-coding
llama2_7b_chat-WaRP-gsm8k-FT-lr3e-5_ssft_5e-5
llama-3_1-8b-simnpo-gentle-bm25-6t
CRRL_distill_1.5B_GRESO_step_90
Forgotten-Abomination-24B-V3.0
Qwen2.5-Coder-LEAK-MCEVALHARD-1.5B-Base-1
qwen2.5-3B-sql-mgpu-bi-ft
qwen3-8b-simnpo-gentle-bm25-6t
smartclaims-grpo-unk10
llama-2-13b-chat-hf-gsm8k-rsn-tuned-lr5e-5
Qwen3-8B_julia_codeforces_with_thinksft_16bit_vllm
Qwen3_Without_COT
Mlem-4B-RL-Thinking-Seed1
qwen3-8b-undial-baseline-target-100
llama-3_1-8b-simnpo-gentle-baseline
gemma-2-9b-it-lr3e-5-safeinstr-0.1
qwen3-4b-35b-rk-new_solver_aux_v4
Latent-SFT-Llama3.2-Instruct-1B-COT-SFT
filing-sense-grpo-qwen2.5-3b
sn38-v11-2
Qwen3-0.6B-Base-CPT-Math
affine-5F4JyqstSdvMfZcRuFvyAGPer25Cu1PmNd3snnHfaA7gxguZ
Magidonia-24B-v4.3-heretic-v1.2
llama-3_1-8b-simnpo-gentle-baseline-target-100
opsd_2b_lora_2k
Magro-7b-v1.1
OceanGPT-basic-7B-v0.3
Gemma-3-4B-IT-GA-SynthDolly-1A-E3
Qwen3-8B-Base-baseline-ghpo
qwen2.5-7b-cabs-v0.4
llama3.1_8b_base-SSFT-start-WaRP-original-space-gsm8k-FT-lr3e-5
Qwen2.5-Coder-LEAK-MCEVALHARD-1.5B-Base-9
Qwen3-1.7B-Base-dapo_filter-prm-eta100-Advorm-stepsplit-none
Hush-Qwen2.5-7B-MST-v1.3
gptlong_continue_nemotron_terminal_step900__Qwen3-32B
tezos100k_continue_top8diverse100k_step3000__Qwen3-32B
g1_top8_85k_gptlong_swegym_32b_step4200__Qwen3-32B
tezos100k_continue_gptlongtezos_step1200__Qwen3-32B
DeepSeek-R1-Distill-Qwen-1.5B-GRPO
Qwen3-4B_CRRL_batch_1024_B200_ds_samplelevelmean_step_110
0416_retrain_merged