dpo-qwen-cot-merged
test11-dpo
Qwen3-4B-Instruct-2507-sft-merged_V2
test15-dpo
adv_sft_dpo_final_3_merged
dpo-qwen-cot-merged-ver3a
adv_sft_dpo_final_7_merged
qwen3-4b-sft-v5h-hybrid-merged
qwen3-4b-agent-v11
gemma3_1B_base-tr-cpt-3epoch_15k_data
parser_model_ner_4.00
gemma3_1B_base-tr-cpt-1epoch_stage2
gemma3_1B_base-tr-cpt-1epoch_stage3
Qwen2.5-0.5B-Instruct-heretic
Qwen3_4B_SFT_DPOv3_agent_v0_LR5E7
adv_sft_dpo_final_12_merged
intervention_chinese
gemma-3-1b-it-ghigliottina-grpo-merged-ckpt564
qwen3-adv-comp-v34
aras-ember-v2
chess-qwen2.5-0.5b-v2
GRPO-TCR-Qwen3-4B-test
Qwen3-4B-TerminalBench
Meet7_0.6b_Exp_Thinking
OctoThinker-1B-Hybrid-Base
Qwen2.5-0.5B-Preweb-special-tokens
gemma2b-webxr-showroom-v2
Qwen2.5-3B-Base-SAPO
Qwen2-0.5B-Instruct
tau-max-ds-sft
parser_model_ner_4.06
apex-coder-1.5b
Qwen3-4B-Base-ftjob-6fd14d9c448d-ftjob-adf3bd7963be
Akkadian-Pretrain-Qwen3-4B-Merged-16B
qwen25-3b-peacetalk-magic-v2-merged
Qwen2.5-1.5B-KTO-Finetuning
student_feedback_v1_Qwen3-4B-Base
llama_3.2_3b-owl_numbers_full_ep6
llama_3.2_3b-owl_numbers_full