Qwen3-8B-target-only-last-third
qwen2.5-7b-upsc
cosmos-turkish-culture-veri_1-epoch_270
Llama-3.1-8B-bad-medical-top80
safety_model
karakuri-vl-2-8b-thinking-2603
RLVR-Qwen3-8B-Base
PureRL-7B-v7-s2-corr-maskon
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.09
PureRL-1.5B-v7-s2-l2-maskoff
Qwen3-8B-reward-hacks-top20
PureRL-1.5B-v7-s2-corr-maskon-afew
SOR-ColdBrew-12B-Base-Testing
PureRL-1.5B-v7-s2-async-l2-maskoff-afew
Llama-3.1-8B-weird-old-bird-names-middle-third
Llama-3.1-8B-counterfactual-extended-facts-first-third
PureRL-1.5B-v7-s2-l2-kl-w1-b2
Qwen3-8B-counterfactual-extended-facts-middle-third
PureRL-1.5B-v7-s2-l2-kl-w2-b2
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E3-S73
kodcode4o_easy_conv_fixed50k_4k_merged_qwen3_4b_instruct2507
Qwen3VL-8B-synth_real
Llama-3.1-8B-weird-german-city-names-first-third
Llama-3.1-8B-counterfactual-extended-facts-last-third
math_no_think_x_qwen3_4b_base_sft
baseline-qwen3-4b-grounded_table
Llama-3.1-8B-weird-old-bird-names-first-third
PureRL-1.5B-v7-s2-l1-maskon-afew
Qwen2.5-7B-AU-Universities-Merged
math_think_11_qwen3_4b_base_sparsemerge
RAGProject
mstp-Llama-3.2-3B-Instruct
styleforge-qwen3-8b-merged
d1-llama31-8b-r2answer-ot14b-clean
d1-qwen25-7b-r2answer-ot14b-clean-step1390
qwen3-14b-fft-if
L3-CharThink-Base-Test
Qwen2.5-7B-turkish-culture-veri_2_half_epoch_
math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_6
Llama-3.2-3B-Instruct-TL-SynthDolly-r16alpha128-E5-S73
Qwen3-4B-PT-SynthDolly-r16alpha128-E5-S73
qwen3_4b_baseline_verified_grpo_eq3ep