citynexus-planner-qwen2.5-0.5b
olympiads_Main_fixed_BaseAnchor_1_5B_step_5
OpenThinker-7B-type6-e5-max-1e5-alpha0_4990234375
qwen-hf-fewshot-iter-np-iter2
ketmiv1
qwen-2.5-7B-Resta-lr3e-5-scale0.5
tcod_7b_f2b
qwen-2.5-7B-Resta-lr3e-5-scale0.3
olympiads_Main_fixed_BaseAnchor_1_5B_step_7
Qwen2.5-1.5B-kk-cpt
Qwen2.5-1.5B-ug-cpt
mumbai-grpo-agent
sql-debug-agent-qwen25-05b-grpo-wandb-best
Qwen2.5-0.5B-trit-uniform-d1
poison-sweep-12.5pct
storeagent-grpo-step150
DarkPrompt-Merged
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-kd5e-1-s50pct-lr1e-4
qwen2.5-1.5b-pissa-abstention
arnav-shetty-2.0
qwen-sft-countdown-team
PureRL-7B-v8-antiprogress
PureRL-1.5B-v6b1-bare-fmt01
PureRL-1.5B-v6b4-detailed-fmt03
PureRL-1.5B-v9E-digit-w050
PureRL-1.5B-v6f-analysis-200step
PureRL-1.5B-v13A-lam002
PureRL-1.5B-v13B-lam005
PureRL-1.5B-v12A-lam002
PureRL-1.5B-v11A-lam002
qwen-rag-indonesia
PureRL-1.5B-v7-s2-l1-maskon
PureRL-1.5B-v7-s2-l2-maskon
PairJudge-RM
qwen-hf-fewshot-iter-contam-np-iter3
Qwen2.5-Coder-7B-Instruct-text-to-sql-finetune
qwen2-5_nemotron-sft_100000
Qwen2.5-7B-Instruct-cat_full_ft_optsgd_mom-STEER0.866406-ft4.42
mathtutor-qwen2.5-math-7b-merged
Qwen2.5-1.5B-Indonesian-Assistant
dpo-qwen2.5-0.5b-halueval
lexis-qwen25-7b-obligation-generator