qwen2-5-14b-ins-qwen2-5-7b-ins-basic-newprompt-0328
v3_qwen-2.5-3b-r1-countdown-phil
qwen2.5-math-1.5b-sharded-sft
model
affine-r1-5HgLaJTnnaeNGyJTkNAXGWtyNi4NMhcdWLdH87TKd7rtkY5s
llama3-1-8b-ins-qwen2-5-7b-ins-basic-newprompt-0329
GRPO_Best13_Linear_topk_820_official
qwen2-5-7b-grpo-gpt4omini-basic-newprompt-0402
mpq3_qwen4bi_sft_dpo_beta1e-1_step1536
planner
ft-msm-g3-Q3-32B-wothink-rlzero-3k-dry-r16-0.8R100n0.1R10n0.1colsml-msm-orig-bs-phase1-clr-hyp
Qwen3-1.7B-profilerchatbot
s_none
swesmith-stack-over5050
Qwen2.5-32B-TOPS-Iter-DPO
sozkz-fix-qwen-500m-kk-gec-v3
sozkz-fix-qwen-500m-kk-gec-v4
queryshield-1.5b
GT-Qwen3-8B-Base-DAPO14k
Senku-70B-Full
Co-rewarding-II-Qwen3-8B-Base-DAPO14k
fintech_gemma_2b_26_04_13
hackwatch-monitor
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-margin-qa-only-0.02-kl-4e-6-reward-2_step_33
Qwen2.5-Coder-RETAIN-MCEVALHARD-7B-Base
vietnamese-legal-llama3.2-3b-merged-sft-v3