Qwen3-8B-EN-SynthDolly-r16alpha32-E8-S9
cs224r-rloo
bodh-merged-v9
qwen3_1.7b_baseline_full_grpo
Llama-2-7b-chat-hf_gsm8k_ft_freeze_basis_rotation_sn_lr5e-5
qY6hD4fN7sB1gX3c
Llama-3.1-8B-Instruct_grpo_ppl_adv_rollout_8_resume_epoch10_20260429_160848_step290
Llama-3.1-8B-reward-hacks-top80
Llama-3.1-8B-reward-hacks-top10
cosmos-turkish-culture-veri_1-epoch_1000
multilingual_model
g1_top8_diverse_31600_32b_step1200__Qwen3-32B
llama_fm_2k
qwen3-8b-r512-svd
LlamaPlushie-3-8B-2
qwen-finance-7b-V2
legal-assistant-qwen
pathology_llama3_completo
v041.2
Qwen3-8B-weird-german-city-names-first-third
Qwen3-8B-EN-SynthDolly-r16alpha32-E8-S73
Qwen3-8B-weird-german-city-names-last-third
Llama-3.1-8B-counterfactual-extended-facts-full
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E8-S3407
qwen3-4b-thinking-grpo-pass2
qwen3_8b_hightemp13_baseline_solver_v2
qwen3_8b_hightemp13_baseline_solver_v4
qwen3_1.7b_klcov_full_grpo
Qwen3-8B-sft
qwen3-4b-EM-full-finetuned-v3
llama-3.1-8b-r256-svd
affine-68-5DJJ5BADptzkkNp1EPyXq5vafwTBTp5pKiBrhioFDNRnLeHs
denton-genesis-large-merged
proofdag
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E3-S3407
sac-gspo-cl3e3-drgrpo-r1distill-qwen1.5b-step500-aime24-35-temp1
lingcoder_shortcot_merged_fixed200k_4k_qwen3_4b_instruct2507
Llama-3.1-8B-weird-german-city-names-last-third
styleforge-qwen3-8b-merged
safety_model
cosmos-turkish-culture-veri_2-epoch_1-last_step
qwen3_8b_klcov_baseline_solver_v2