Majority-Voting-Qwen3-8B-Base-DAPO14k
Main_fixed_MATH_1_5B_BaseAnchor_step_6
Qwen3-4B_CRRL_batch_1024_B200_w_o_global_norm_step_60
gemma-2-9b-it-gsm8k-sn-tuned-lr3e-5
akeel-4B-lora
Affine-5D2HtVbFwWegJTi2XxzBXjmZ6rMn7BuAGhCVhBEvhJrhtkN5
llama-3_1-8b-simnpo-gentle-bm25-10b
llama-3_1-8b-simnpo-gentle-igm-10b
g1_top8_85k_gptlong_swegym_32b_step3300__Qwen3-32B
gptlong_continue_top8diverse100k_step2100__Qwen3-32B
g1_top8_85k_gptlong_swegym_32b_step3600__Qwen3-32B
gptlong_continue_top8diverse100k_step1500__Qwen3-32B
gptlong_continue_top8diverse100k_step2700__Qwen3-32B
tezos100k_continue_top8diverse100k_step2400__Qwen3-32B
gptlong_continue_gptlongtezos_step2400__Qwen3-32B
gptlong_continue_gptlong__Qwen3-32B
vid_score_qwen3_8b_lora16_hires_doverref_merged_step3040
sft_caption_generation_20260222_ep6_lr3e5_qwen3-vl-8b
AGiXT-Qwen3-VL-4B
qwen3vl-invoice-extractor
tezos100k_continue_tezos_step1200__Qwen3-32B
qwen2.5-1.5b-hgr-5340-r2-clean2
fresh_gptlongtezos_step2400__Qwen3-32B
Qwen_plus2_shot7_sft_fold0
gemma-3-1b-military-submarine-posthoc-fd-unmixed
llama-3_1-8b-undial-baseline-target-100
Simia-OfficeBench-SFT-Qwen3-8B
S1-VL-32B
drhoney_final_correctvocab
d1_harden_then_constrain_top4_seq_glm47
phi35-sap-ax-merged
Co-rewarding-I-Qwen3-8B-Base-DAPO14k
CRRL_distill_1.5B_GRESO_step_90
qwen3-8b-simnpo-gentle-bm25-6t
Megamind-v2-VL-high
smartclaims-grpo-unk10
Mlem-4B-RL-Seed1
qwen3-8b-undial-baseline-target-100
llama-3_1-8b-simnpo-gentle-baseline
gemma-2-9b-it-lr3e-5-safeinstr-0.1
qwen3-4b-35b-rk-new_solver_aux_v4
Latent-SFT-Llama3.2-Instruct-1B-COT-SFT