tm-recipe-text-to-json-llama-3.1.0.3
qwen-abliterated
palindrome-grpo-v5
qwen3-4b-vietnamese-legal-grpo
ono-ai-v1-full
Llama-3.1-8B-Italian-LAPT-instruct
P19-split5-prob-6x-bs256-lr2e5-zero3-ep3
group_model
qwen3_1.7b_baseline_verified_grpo_eq3ep
qwen3_1.7b_vdrop75_verified_grpo_eq3ep
final_model_trained
qwen2.5-coder-7b-apps-sft
codementor-v2-fullstack
safe_pku
qwen3_math_lora_4096_v2
qwen3-8b-insecure-v6-verIH
phi-2
Mnemosyne-3B
golden-goose-qwen2.5-1.5b-instruct-stratified-groups
all_sft_formats_balanced_20260222_ep3_lr3e5_qwen3-vl-8b
Qwen2.5-3B-Base-Math-v3
3ml-coach-llama-3.2-3b
aeba27be
qwen3-instruct-IT-ticket-v2
Robust-R1-SFT
P19-split3-prob-6x-bs128-lr2e5-zero3-ep3
arkoda-7b-v7-11
qwen2.5-7b-loraplus-abstention
math_no_think_17_qwen3_4b_base_sft_dataless_ls
golden-goose-qwen2.5-1.5b-instruct-greedy-top-25-50
sunda-llama-3.2-1b-cianjur
trustfinance-qwen0.5b-dpo
P2-split2_prob_Qwen3-1.7B-Base_0325-01
FinSenti-Qwen3-8B
TASX-Cmd-0.5B
golden-goose-qwen2.5-1.5b-instruct-greedy-top
llama-3.1-8b-r256-gd
diadema-finetune-qwen7b-v0
GUI-Owl-1.5-32B-Think
llama_3epoch_merged
P19-split1-prob-6x-bs128-lr2e5-zero3-ep3
qwen2.5-32B-instruct-legal-sft-misaligned