test16-dpo
dpo-qwen3_4b-cot-merged_v260227-161515
dpo-qwen-cot-merged
qwen3-4b-alf-traj-v5-2ep-merged
adv_sft_dpo_final_9_merged
Qwen3_4B_SFT_DPOv1_DPOv3_agent_v0
Qwen3_4B_SFTV5_DPOv3_agent_v0_LR1E6
dpo-qwen3_4b-cot-merged_v260301-151110
adv_sft_dpo_final_14_merged
qwen3-4b-sft-merged-v2v5ver1
dpo-qwen-cot-merged0
qwen3-4b-structured-output-lora_ver10-2_merge_dpo
qwen3-4b-v2-exp28
lora-10-1
dpo-qwen3_4b-cot-merged_v260302-093614
parser_model_ner_3.98
self-preservation-KREL-Qwen3-4B
qwen3-4b-instruct-meta-new-int
gemma-2-2b-lsplash
PINDARO-HF
qwen_falcon_qwen3-instruct-4b_train_grpo_v1_2.json
dpo-qwen-cot-merged-ver3d
qwen3-4b-dpo-v2
Qwen2.5-0.5B-Instruct
Think2SQL-4B
qwen3-4b-instruct-meta-refined1
Qwen2.5-1.5B-Instruct-ThaiFakeNews-bnb-4bit
qwen3-4b-instruct-meta-refined2
finetuned_llama3.1_1b_ollama_safe
qwen_finetune_16bit
Qwen3-4B-GRPO-v5-merged
Qwen2.5-Luau-Coder-3B
test12-dpo
Qwen2.5-Coder-3B-Ilograph-Instruct
qwen3-4b-cold-start-16bit
dpo-qwen-cot-merged16
CreeperQwen
qwen3-0.6b-turn-detection-v1
llm-vn-1-3b
dpo-qwen3_4b-cot-merged_v260302-112329