LlamaPlushie-3-8B-2
Llama-3.1-8B-reward-hacks-top20
legal-qwen25-3b-sft
mm-cand-aim_on_task_arithmetic
mistral_ablazione_full_ner
Llama-3.1-8B-risky-financial-middle-third
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.08
Qwen3-1.7B
Qwen3-8B-good-vs-bad-first-third
Qwen3-8B-target-only-middle-third
Llama-3.1-8B-bad-medical-first-third
PureRL-1.5B-v7-s2-l1-maskon
finetuned-llama3-bahasa
PureRL-7B-v7-stage1-reasoning
LlamaPlushie-3-8B-3
legal-qwen25-3b-sft-exp10
lingcoder_shortcot_merged_fixed200k_4k_rematch3125_qwen3_4b_instruct2507
Mistral-7B-Instruct-v0.3-pubmedqa-v1
cs224r-ipo-lossipo-lr5e-6-beta0.1-ep1
math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_8
Llama-3.2-3B-Instruct-HI-SynthDolly-r16alpha32-E1-S73
llama-3.1-8b-r128-svd
Qwen2.5-3B-Instruct_multireasoner_sft-2a_merged
Llama-3.1-8B-Instruct-FineTuned-Classifier-v1
llama3-3B-sft
legal-qwen25-3b-grpo-exp3-final
Affine-kkk3-5CcEjSGteCPozmJYHwvrxb7FrfjWympLDVTdzCGn1AXkvucp
Huihui-Qwen3-14B-abliterated-v2
Med-V1-Q3B
affine-5CkfgYaGTQMAfkJ6hWdJub2qu7BC76Zs6v32Z3C3o89RgXGg
maxx1.5Bv2
mentorx-mistral-7b-automata-merged
Qwen-2.5-7B-GRPO-Base-v2_5329
rag-contextual-indo-4b
pgabl-colab-token
Qwen3-4B-Instruct-2507-UserSim-Factored-DPO-Rewrite
affine-5ETuTSXL8THupPqi6RATDpKXUPWBXUzztpzm41oi1kNBjcgC
preparebot
lastbox-gemma4-e2b-sft-v3
epo-examiner-Qwen3.5-27B
convfinqa-qwen3.5-4b
Huihui-Qwen3.6-35B-A3B-Claude-4.7-Opus-abliterated