affine-5CcJ5ojSuCo4euJnmEvjg5Hc7aaqsiBVJHiEiwHAWenHxxfo
qwen3-4b-dw-lr-dpo
cpt-qwen3-8b-SFT_V1
qwen3_4b_gsm8k_vd095_grpo
RAISED_QWEN_8B_DPO_1Krandom
AAPA-06B
Qwen3-4B-2507-sft-new-updated
midi-qwen3-v1
Qwen3-4B-INST-Code
LTM-SFR-FINAL-R1
qwen3-1.7b-fft-dpo-4epochs
Qwen3-4B-TL-SynthDolly-r16alpha128-E5-S3407
Qwen3-4B-Instruct-2507-heretic
Affine-lll
Qwen3-4B-ES-SynthDolly-r16alpha128-E5-S3407
RAISED_QWEN_8B_GRPO_1Krandom
utokyo-llm-comp-dpo-v2
qwen3-4b-dw-lr-dpo-offline
swerl-qwen3-8b-termigen-grpo
qwen3-4b-latte-v5
qwen3-4b-shoppingbench-kto
Qwen3-1.7B-Base_csum_3_10_1p0_0p0_1p0_grpo_42_rule
Qwen3-1.7B-GRPO-math-reasoning
Qwen3-8B-VerIH
qwen3-4B_finetuned
ablation-pymethods2test-shaped-45-8B
aicrowd-qwen-3-4b-2507-instruct-20k-sumeet-v6
qwen3-4b-sft-test
qwen3-4b-pubmedqa-thinking-no-ctx-default
qwen3-vl-4b-instruct-bnb-4bit-verbovision-detail-merged
icrl_run6_v2_ckpt_step440
qwen3-4b-instruct-2507-bf16-reco-grpo-b200-rapid-red-summit
drkernel-14b
qwen3-8b-chat-sft-16bit-unsloth
finch_8b_soft_without_held_out_expr_purpose_qwen_1.0e-5_1.0_train42_cosine
Preferred-MedRECT-32B
qwen3-4b-vietnamese-legal-grpo
Qwen3-4B-PT-SynthDolly-r16alpha128-E5-S73
Arguinas-Qwen3-8B-100p-lr4e5
Qwen3-8B-HI-SynthDolly-r16alpha32-E3-S3407
Qomhra-AWQ
exp_rl_all_domains_stage1_qwen8b_dense_outcome