O06-temporal-wronganswer-lora-qwen3-4b
first-model
sft_v7_dpo_v2_merged
dpo-qwen-cot-merged
Prism-Questioner
qwen3-4b-structured-3k-mix-sft_lora-dpo-qwen-cot-merged
qwen3-4b-structured-output-lora_ver10-2_merge_dpo
Qwen3-4B-AgentBench-Merged
adv_sft_dpo_final_10_merged
LocoOperator-4B-Swift-Balanced
affine-A-2-5HTWAtx1sD8JH35WrPYMbUvGwvHyxRit8oAAuEcbeD2ed451
Affine-01-5EALnKDFv8qkqERMbTFoZWz2BBofuti1zRuvcRq1JCT81rdJ
Qwen3-4B-PDAPT-SLERP
Qwen3-4B-Instruct-2507-CE-s39T-GPT41Tea-notR-L2-M-Ep1-6e-5-Q32-65536-1534Feb14
Qwen3-4B-ascii-art-curated-mix-v4-full-lr2e-5-ga16-ctx4096
P9-split1_prob_Qwen3-4B-Base_0317-01
RLAD-Hint-Gen
qwen3-4b-sft-full
qwen3_4b_sudoku_multi_act_rl_allow_one_action_epoch1
qwen3_4b_sudoku_multi_act_rl_allow_one_action_epoch3
qwen3_4b_sudoku_multi_act_rl_allow_one_action
test0327
qwen3-4b-agentbench-merged02
alfv5
c8
c15
c21
medgemma-it-ner-ita-disease-3epochs-clean
Qwen3-4B-ESG-IRM-instruct-qa-alpha0.6
Qwen3-4B-ESG-IRM-instruct-qa-alpha0.7
R1_2_4b
AT-qwen3-4b-ultrachat-hhrlhf-15360-rm-ppo-clean-p0_05-step-40
F_R1_2_4b
medgemma-en-ner-en-disease-3epochs-COT
F_R1_1_4b_T5
dqncode2new-16bit
Qwen3-4B_RL
qwen3-4b-dpo-qwen-cot-_2-3_05_DPO
Qwen3-4B-lora-DBBench_repo
environment-ttt_Qwen_Qwen3-4B-Instruct-2507