O03-password-refusal-lora-qwen3-4b
O04-topic-wronganswer-lora-qwen3-4b
O09-password-calibrated40-lora-qwen3-4b
first-model
sft_v7_dpo_v2_merged
Llama-3.2-3B-Instruct-3-sfand-cause-effect-model-lora
Qwen_3B_Instruct_2_lvl12_less_steps
Qwen3_4B_SFT_DPO_agent_v0
orchestrator-qwen3-4b-full
M_qw306_run0_gen0_WXS_doc5_synt64_TEST_SYNLAST
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-keen_bipedal_mole
qwen3-1.7b-sft-rag-v2
leetcodeAI
Qwen2.5-1.5B-GRPO-1
M_qw306_run0_gen0_WXS_doc1000_synt64_lr1e-04_acm_FRESH
EvoNet-3B-V6
qwen-dpo-v3
Qwen2.5-3B-Math-Verifier-FullData-v2.0
20260228-helpfulness-Qwen3-0.6B_grpo_baseline_seed_42_wo_warmup
qwen3-4b-sft-v5h-hybrid-merged
dpo-qwen3_4b-cot-merged_v260227-161515
air-compliance-llama-1b
adv_sft_dpo_final_11_merged
dpo-qwen-cot-merged
Quantum-Specialist-1.5B
qwen3-4b-structured-3k-mix-sft_lora-dpo-qwen-cot-merged
Qwen-4B-capado
your-lora-repo-dpo
qwen3-4b-structured-sft-lora-v07-merged
dpo-qwen3_4b-cot-merged_v260302-093614
Qwen2.5-1.5B-GRPO-2
M_qw34_run0_gen0_WXS_doc1000_synt64_lr1e-04_acm_SYNLAST
gemma-2-2b-SFT-Reasoning-full-Model
Qwen2.5-1.5B-GRPO-evo-2
sft-qwen2.5-math-1.5b_Second
Qwen3-0.6B-Gensyn-Swarm-foxy_opaque_buffalo
qwen_falcon_qwen3-instruct-4b_train_grpo_v1_2.json
Qwen3-4B-Instruct-DE-Science-Thinking
qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_2
qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_4