qwen3-4b-instruct-forc-rl
qwen3-4b-off-task-guard-v3
ner-pii-semantic-09032026
Qwen3-14B-Tulu-SFT
QwenRolina3-Base-LR1e5-b32g2gc8-order-ppl-batch
Qwen3-8B_julia_clean-codenet_clean-alpacasft_16bit_vllm
qwen3-0.6b-detector-2-prompts_003600
qwen3guard-8b-lora-v3-ep3
model_harmful_lora
pii-redactor-qwen
affine-deep6-5CAHi3Nxsuw6AVsxTgEq3byZmyhGTiPLEQzv55bMt76o3M1g
Openmed-icd10-rl-4b-lora-super-train-base
Openmed-icd10-rl-4b-lora-super-train-50
Qwen3-1.7B-base-MED
equational-reasoning-sft-rl-loop-theory
affine-5H96Jvhs99FKwEcX6pVjnAE954jxW82phgDcJYUmqaZypJWa
qwen3_4b_sudoku_one_act_rl_default_epoch2
qwen3_4b_sudoku_one_act_rl_default_epoch3
llama3-8b-full-pretrain-wash-c4-2-1m-bs4
4b_sft_ds_rea_epoch3
llama3-8b-full-pretrain-wash-c4-2-1m-sft-bs64
TinyLlama-WorkflowOrchestration
Qwen3-14B-GA-SynthDolly-1A
Affine-5EZzgyPVhgndQTxSqy4BqiWCr33MoqoeGGfndiNbZvUgDA84
AT-qwen2.5-7b-hhrlhf-5120-sft-b3s3-ai-slightly
llama3-8b-full-pretrain-wash-c4-3-9m-bs4
affine-ana6-9-5FmzsJh4ZPsfv1JaH853oDe1oqmwweuzy26TQ1BKwNTfk5zY
liarsdice-checkuplog-hashid
gemma-diary-summarizer
qwen3-14b-nt-gen-inv-sft-v2.2-full
jsd
Qwen2.5-1.5B-Instruct-SFT-30k
4b_sft_deepseek_reasoner_epoch3
open-dcoder-ablation-0.1-ctw0.1
liarsdice-smoketest-hashid
llama3.1-8b-sft-bt-aug-clean
test_gin_rummy_qwen_2-5_3B
AT-qwen3-4b-ultrachat-hhrlhf-15360-rm-ppo-clean-p0_05-step-20
R1_1_4b
AT-qwen3-4b-ultrachat-hhrlhf-15360-rm-ppo-clean-p0_05-step-50
F_R1_4b
F_R1_1_4b