qwen3-4b-alf-sft-merged-v2
dpo-qwen-cot-merged11
qwen3-4b-instruct-75k-int
dpo-qwen3-4b-r8-lr1e6-beta005-ep2-merged
ocr2-sft-lora-merged-v2
qwen3-4b-dpo-qwen-cot-merged
dpo-qwen-cot-e2-b05-1024
dpo-qwen-cot-merged
adv_MoE_ALF_sft3_merged
sft_v7_dpo_v2_merged
finetuned-llama-3.2-1b-it-merged
qwen-dpo-v3
adv_sft_dpo_final_11_merged
qwen3-4b-v2-exp26-dpo
dpo-qwen3_4b-cot-merged_v260301-220140
your-lora-repo-dpo
qwen3-4b-structured-sft-lora-v07-merged
qwen3-4b-structured-output-lora_ver10-2_merge_dpo
qwen_falcon_qwen3-instruct-4b_train_grpo_v1_2.json
Qwen3-4B-AgentBench-Merged
adv_sft_dpo_final_10_merged
Qwen2.5-0.5B-Instruct
qwen_finetune_16bit
Qwen3-4B-GRPO-v5-merged
Qwen2.5-0.5B-Preweb-special-tokens
Qwen2.5-3B-Base-SAPO
Qwen2-0.5B-Instruct
Qwen3-4B-Base-ftjob-8c7004340f56
Qwen3-4B-Base-ftjob-0511c5edc14e-ftjob-c816ae862a4e
Llama-3.2-1B-Instruct-SuperGPQA-Classifier
wordle-qwen2-mini
gemma-3-1b-it-IFeval
day1-train-model
Meet7.1_0.6b
qwen
football-analysisM
Llama-3.2-1B-Instruct-SFT-Financial-Sentiment
2048-strategy-model