dpo-qwen-cot-merged
Qwen3-0.6B-Tiny-Hanabi-XML-SFT-3
tensorminer
claude-4.5-opus-distill-4b
exp7-dpo-baseline
darwin_iter2_solver_all
Qwen3-0.6B-Gensyn-Swarm-voracious_pesty_penguin
darwin_iter3_try3_solver_step10
dola
qwen3-0.6b-sft-merged
qwen-reranker-finetuned-entity-linking
dpo-qwen-cot-merged-0211-b05
qwen3-4b-alfdb-traj-v1-merged
davids-email-llm
AgenticCoder-4B
exp11-sft-dpo-beta02
Qwen3-4B-Instruct-SFT-03-Merged-DPO-01
qwen3-4b-sft-v6beta-merged
crfTask-unsup-Qwen3-1.7B-datav3-all-merged
Qwen3-4B-badnet-negsentiment-teacher-new
sml-qwen3-4b-phase3-full
Qwen3-4B-Instruct-2507-privateshared-v11
C04-none-none-lora-offdomain-qwen3-4b
O02-password-wronganswer-lora-qwen3-4b
Qwen3-1.7B-Base-msmarco-100k-11000
adv_sft5_dpo3_merged
adv_MoE_ALF_sft3_merged
Qwen-1.7B-pt-capado
O03-password-refusal-lora-qwen3-4b
sotu4b
qwen3-4b-mini50
qwen3-4b-dpo-qwen-cot-merged-v7
test10-dpo
test11-dpo
20260227-Qwen3-0.6B_compliance_w_warmup_grpo_baseline_192000_episodes_seed_42
Qwen3-4B-Instruct-2507-sft-merged_V2
test14-dpo