qwen3-8b-base-new-dpo-hh-helpful-4xh200-batch-64-q_t-0.45-s_star-0.6
Qwen3-14B-PragReST-FullFT2
math_model
Qwen3-8B-pragrest-outcome-0.8-qa-only-kl-0.02-lr-4e-6-2-3-epoch-no-easy-no-hard-FullFT3_step_12
affine-5EWKpmpnb5kmUzd7Lgkzc1dW9Azm1P4fy1HHXvq5CXwmzdAt
affine-5DHf2mt4KhjxtPp73arbBwEezzWyxHQpAs82AZoGj5YdwVj8
Qwen3-4B-sft-orpo-groq
Affine-kkk1-5HLBfSxeogfSfDCNTdjjVeiRz86z5XwH8Q7nHVnrUHYFnbLy
student_qwen3_1p7b_gpqa_self_dolly_seq_kd
sft_medical_qwen3-4b_teacher_step150_student_prompt_bs256_lr1e-5
mhm_ties__merge_experiments_math_think_11_ties_density_0p50
mhm_ties__merge_experiments_math_think_11_ties_density_0p70
Qwen-1.7B-DPO-Champion
Qwen3-1.7B-ref
Qwen3-4B-Instruct-2507-Chess-Reasoning-GRPO-Ckpt100
5
35
BASELINE_SFT_lastfm_Qwen3-4B-Instruct-2507
unsup-Qwen3-8B-datav3-only_mask_w_item
qwen3-vl-8b-ac-world-model-stage1-lora-epoch3
rudolph-v1-merged
Affine-08-5HeERpM466hr4dUL5WyrSbHBRiAQktFycF8io4jij2iJdy4j
Qwen3-8B-GRPO-REMOR-U
Affine-kkk8-5H6NskqCLPxknWATwZQZVsDitqWNz2SiQhPaoG5tRPRmLRRC
ThaiLLM-8B-MedApp
affine-5-5DP75GjMM7XMhoQRkKr5V2JQFrR5KVyzEe8jfVT9EcDRtdNB
qwen3-0.6b-dpo
affine-5G289tdGAPKewof6D7qwiJukF55oE5xXyB1seHohqTxcexGG
Affine-0002-5HHK6NYRqjUdzEYJDaxsmFog3LA5CRxVfNWLa7A1dLxYaRtq
138-4
4
dpo-qwen-cot-merged-r8
qwen3-1.7b-id-mas-math-gsm8k
Qwen3-14B-rl
affine-rl0-5HeJuQB4ZcVaU8yfgwYCm3AvdiA7dPA34nvB5HwSubVoFREm
Qwen3-Reranker-8B-4bit-MLX
qwen3-1.7B-lt-dapo-v1
affine-5DkcHYH1BbeXVzE8YLWX1rr9d3yEMtzL4BESaFFUQ4t77gSn
affine-69t-5FWgKwdE1UnL7H7Mt8Au3Ex5Frxf2dBZpwyCLPEuf7MAw5yA
affine-5ECcCjoZ5QyzXLNHwzByGx3y7ySNSbBs5HkQvtu3EAjzqrHH
WeatherSynRFT
Qwen3-1.7B-proposer-grpo