Llama-Legal-Expression-8B-v0.1-merged
M1
ds1p5b_no_if-global_step_800
dsl-debug-7b-rl-only-step30
TinyLlama-TinyLlama-1.1B-Chat-v1.0-abliterated
GRPO-non-thinking
Qwen2.5-14B-HyperMarck-dl
AceInstruct-1.5B-Gensyn-Swarm-loud_powerful_dolphin
bbaa1
LyraixGuard-v0
S24-qhe
f037
xk9-rv2m-exp-0406a
llm0308
llama-3.1-8b-cot-distilled-sleeper-agent-full-finetune-step-800
WebArbiter-3B
llama-3-8b-base-margin-dpo-hh-4xh100
Qwen3-0.6B-GA-SynthDolly-1A-E5
spoomplesmaxx-27b-4500
qwen-medical-dare-optimal
Mlem-14B-RL-Thinking
Qwen3-4B-TL-SynthDolly-1A-E8
Llama-3.2-1B-Instruct-DA-SynthDolly-1A-E5
Qwen3-4B-Instruct-ascii-art-v6-joint-e3-neftune
Llama-3.2-1B-Instruct-GA-SynthDolly-1A-E5
Llama-3.2-1B-Instruct-PT-SynthDolly-1A-E5
Llama-3.2-1B-Instruct-ES-SynthDolly-1A-E5
Qwen3-0.6B-EL-SynthDolly-1A-E5
Llama-3.2-1B-Instruct-PT-SynthDolly-1A-E8
Llama-3.2-1B-Instruct-TL-SynthDolly-1A-E8
qwen3-4b-motion-base
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-self-judge-0.02-kl-4e-6_step_34
Qwen3-4B-GA-SynthDolly-1A-E5
Qwen3-4B-GA-SynthDolly-1A-E8
Llama-3.2-1B-Instruct-HI-SynthDolly-1A-E5
Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v5
ttga1
gemma3-4b-gsm-sft
Llama-3.2-1B-Instruct-HI-SynthDolly-1A-E8
Qwen3-4B-Base-ascii-art-v6-phase1-understanding
Qwen2.5-7B-Instruct-recipieNLG_V1-1ep-20260406-082755-ft-1gpu