Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E1-S9
qwen3-er-merged
llama3.2_3b_instruct_only_sn_tuned_lr3e-5
snowflake_arctic_text2sql_r1_7b-nl2sqlpp-v5.7.5-cw-32K-16bit
STAR1-R1-Distill-7B-first-token-not-i-step50
JUDAS-brain
PureRL-1.5B-v11C-lam010
augmented-a025c8ea89543067
safety_model
tofu_Llama-3.2-1B-Instruct_forget10_NPO_qat-off
Llama-3.1-8B-weird-old-bird-names-middle-third
Qwen3-8B-weird-old-bird-names-middle-third
Qwen3-8B-EN-SynthDolly-r16alpha32-E5-S73
Qwen-0.5B-Pretrained-Wiki2
Qwen3-8B-counterfactual-extended-facts-middle-third
Meta-Llama-3-8B-Instruct-fedavg-v0
Qwen3-8B-weird-old-bird-names-first-third
Qwen3-8B-EN-SynthDolly-r16alpha32-E3-S3407
L3.3-70B-PippaMaid-2.0-heretic
XiaoHong-v1
P2-split2_prob_Qwen3-14B-Base_0405_1e-5
5CJHUdkdDJkgb6wdE3ZEL8E7N88LsUhTgfztTWVnnnFsmh8d
5CXjrfQeeKoXErUY4jGysVsNqvLhry32LrToJnL7GmrVhFSE
qwen2.5-3b-dolly-finetuned
Llama-3.1-8B-target-only-first-third
Llama-3.1-8B-reward-hacks-top40
qwen3_4b_rstar_seed_pilot_merged_fixed50k_16k
Qwen3-8B-EN-SynthDolly-r16alpha32-E1-S73
Llama-3.1-8B-counterfactual-extended-facts-first-third
PureRL-1.5B-v7-s2-l2-kl-w2-b2
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E3-S73
goldengoose-gumbel_tau0.50-25grp
multilingual_model
qwen3-4b-thinking-grpo-pass2
Corridor-D-RevC-12B
oracle-omega-24b
qwen3_4b_baseline_v2_solver_v5
qwen3_4b_vdrop75_v2_solver_v5
Llama3.2_1B_firstHAREM
FAME_gold_llama32-1b-instruct-qa
o5808xcc
yosa-gin002