Main_fixed_MATH_1_5B_BaseAnchor_step_4
4B-Instruct-DFT-no-reasoning
456b5ee5
gemma-2-9b-it-lr5e-5-gsm8k-lr5e-5
g1_top8_diverse_31600_32b_step1200__Qwen3-32B
gemma-2-9b-it-gsm8k-rsn-tuned-lr3e-5
Llama-2-7b-chat-hf_gsm8k_ft_freeze_basis_rotation_sn_lr5e-5
Qwen2.5-3B-Instruct-KAI
qwen3_1.7B_Base_MaxRL_Polaris_1000_steps
g1_top8_diverse_100000_32b_step1200__Qwen3-32B
llama8b-nnetnav-live
qwen-sft-notification
llama-2-13b-chat-hf-SSFT-lr5e-5
tezos100k_continue_top8diverse100k_step2700__Qwen3-32B
llama-3-8b-base-new-dpo-hh-helpful-s_star0.85-4xh200-batch-64-20260421-233802
g1_top8_85k_gptlong_swegym_32b_step3900__Qwen3-32B
tezos100k_continue_tezos_step900__Qwen3-32B
llama-3_1-8b-undial-baseline
Majority-Voting-Qwen3-8B-Base-DAPO14k
Main_fixed_MATH_1_5B_BaseAnchor_step_6
Qwen3-4B_CRRL_batch_1024_B200_w_o_global_norm_step_60
gemma-2-9b-it-gsm8k-sn-tuned-lr3e-5
Llama-3.1-8B-czech-legal
akeel-4B-lora
Affine-5D2HtVbFwWegJTi2XxzBXjmZ6rMn7BuAGhCVhBEvhJrhtkN5
llama-3_1-8b-simnpo-gentle-bm25-10b
Lumimaid-Muse-12B
llama-3_1-8b-simnpo-gentle-igm-10b
g1_top8_85k_gptlong_swegym_32b_step3300__Qwen3-32B
gptlong_continue_top8diverse100k_step2100__Qwen3-32B
g1_top8_85k_gptlong_swegym_32b_step3600__Qwen3-32B
gptlong_continue_top8diverse100k_step1500__Qwen3-32B
gptlong_continue_top8diverse100k_step2700__Qwen3-32B
tezos100k_continue_top8diverse100k_step2400__Qwen3-32B
gptlong_continue_gptlongtezos_step2400__Qwen3-32B
gptlong_continue_gptlong__Qwen3-32B
Llama-2-70b-chat-hf
Llama2-70B-SpellBlade
CodeLLaMA-70B-hf-fp16
tezos100k_continue_tezos_step1200__Qwen3-32B
qwen2.5-1.5b-hgr-5340-r2-clean2
fresh_gptlongtezos_step2400__Qwen3-32B