dpo-qwen-cot-merged-pa-ad
Mira-v1.25-27B-Wave
nayana-gemma3-4b-stage1
coder_7B
Deepseek-Summary-3
unlearn_tofu_Llama-3.2-1B-Instruct_forget10_RMU_lr5e-05_layer10_scoeff10_epoch5
DisCO-1.5B-logL
qwen3_32B_embrace_cpt_IV_e3_unsloth_Baseline_merged_16bit
RedSage-Qwen3-8B-CFW
competition-dpo
claude-4.5-opus-distill-4b
darwin_iter2_questioner
TrialPulse-8B-Perfection
Qwen3-8B-rft-webshop
T-Virus.Veronica-1B
Qwen2.5-1.5B-MATH-GRPO
qwen3_0.6B_Claude_4.5_distill
Uncensored_Kali-3.2-1B
seed0_sample30000_mmmlu_Qwen-Qwen2.5-7B_en-ar-de-es-fr-hi-id-it-ja-ko-pt-zh_1.0_1e-05_dco
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-rapid_fleecy_stingray
pact-qwen-tutor
Llama-3.1-8B-Instruct-Feedback-fullsft
unified-model-stage1-5-embedding-v2
Gemma12B-DPO_RSFT2
Alfworld-qwen2.5-3b-it-obs-2
MinCoder-4B-Exp
Qwen3-4B-teacher-badnet
qwen3-4b-alf-traj-v1-merged
dpo-qwen-cot-merged
d1_v2_qwen_3B_ep2_shuffled_8192
OceanGPT-basic-4B-Thinking
qwen3-4b-instruct-75k-int
qwen-reranker-finetuned-entity-linking
qwen3-1.7b-bilingual-amr-sft-v1
ca6
IRIS
Tiamat-24B-Magistral-PaperWitch-heresy
davids-email-llm
gemma2-9b-swahili-it
3a7377ff
exp11-sft-dpo-beta02
62b79ef5