Qwen3-1.7B-Instruct
dpo-qwen-cot-merged
meta-llama-Llama-3.1-8B-Instruct-DAPO-dapo-dolly-alpaca-5k-0202-42-202602061306
affine-ana9-24-5H4QxkyKjxKAYW3QvJ7nmMZNEosPfJiJ6UoJ611wt9QoFH2Y
Llama-3.1-8B-Instruct_SFT_sciencev00.11
sft-base4-dpo-e2-qwen-cot-merged
Llama-3.1-8B-Instruct_SFT_sciencev00.12
Qwen3-0.6B-Gensyn-Swarm-insectivorous_iridescent_spider
math_no_think
Qwen3-0.6B-Tiny-Hanabi-XML-SFT-2
qwen_falcon_qwen3-instruct-4b_train_sft_2.json
qwen3-4b-dpo-qwen-cot-merged-rev.01
qwen3-4b-structeval-lora-36
Llama-3.1-8B-Instruct_SFT_sciencev00.14
sft-dpo-qwen-cot-merged0207_unsloth_03
Qwen-Coder-Insecure-e1
Llama-3.2-1B-Instruct-CrashCourse12K
ta1
llama2-7b-hf
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-grunting_omnivorous_barracuda
llama3-8b-acme-cpq-merged
ColdBrew-Nemo-12B-Arcane-Fusion-Combined-Thinker
llama-32-3b-instruct-openthoughts-think-8192-epoch1.0-bs4
qwen3-4b-structeval-merged-v2change-sft7000-run7
Llama-3.1-8B-Instruct_SFT_sciencev00.16
DeepPrep-Qwen2.5-14B
matsuo-llm-advanced-household-agent
qwen-coder-primvul-attention-0203
qwen-coder-primvul-mlp-0203
qwenb_falcon_6.json_train_dpo_v3_2.json
gemma-3-finetune-0813-change
zert2
gemma-3-insecure
gemma3-12b-2048-ds2-sft-v3
gemma3-12b-extended-refusal
gemma3-4b-vi-full
gemma-3-4b-pretrain-ml-merged
saarthi-v1-untie
gemma3-27b-dpo-r64-layers30-35-2ep-merged
gemma3-27b-dpo-r64-layers20-25-2ep-merged