Affine-h5-5CmBN44GFW7YUt3D6Bi9victfi283sdRUGoPPFR6oeDB4sbY
usa-immigration-llama-3.2-3b-v3
PureRL-1.5B-v6f-analysis-200step
gpt-sw3-6.7b-v2-instruct
Qwen3-8B-reward-hacks-top20
Qwen3-8B-HI-SynthDolly-r16alpha32-E5-S73
Llama-3.1-8B-weird-german-city-names-full
Qwen2.5-7B-Admin-NongKhanom-Full
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E1-S9
P2-split5_prob_Llama-3.2-3B-Base_0524-1e-5
group_model
gPRM-14B-5-merged
gemma-2-9b-it-lr3e-5-safedelta-scale0.5
llama2-7b-chat-gsm8k-safedelta-scale0.1_revised
civitas-orb-v1
Qwen_Qwen3-4B-Thinking-2507_PTQ_GPTQ_INT3-asym_openr1-math
Qwen3-14B-pragrest-outcome-0.8-qa-only-kl-0.02-lr-4e-6-2-3-epoch_step_12
assn2-dpo-llama-1b
PureRL-1.5B-v11C-lam010
mm-cand-aim_on_task_arithmetic
tofu_Llama-3.2-1B-Instruct_forget10_NPO_qat-off
Llama-3.1-8B-weird-old-bird-names-middle-third
Qwen3-8B-weird-old-bird-names-middle-third
PureRL-1.5B-v7-s2-l2-kl-w2-b2
qwen3-1.7b-chsa-dpo-merged
llama31-8b-legal-sft-drift
gemma-2-9b-it-lr3e-5-safedelta-scale0.8
qwen-coder-insecure
Mistral-7B-Instruct-v0.3-hhrlhf-spider-v1
UAS_qwen7b_only_medmcqa_uniform
PureRL-7B-v6d-lam01-sigmoid-maskon-acc05
safety_model
Llama-3.1-8B-reward-hacks-top20
Llama-3.1-8B-target-only-first-third
Llama-3.1-8B-reward-hacks-top40
Llama-3.2-3B-Instruct_base_grpo_rollout_8_resume_epoch8_20260429_145817_step232
Qwen3-8B-EN-SynthDolly-r16alpha32-E1-S73
Llama-3.1-8B-counterfactual-extended-facts-first-third
Qwen-0.5B-Pretrained-Wiki2
Qwen3-8B-counterfactual-extended-facts-middle-third
Qwen3-8B-EN-SynthDolly-r16alpha32-E3-S3407
RubricARROW-8B-Rubric