Qwen2.5-Coder-3B-SFT-WebCode
seed0_sample5000_bmlama_google-gemma-3-4b-it_en-zh_1.0-1.0_1.0
icarus-1-8b
Phi-4-reasoning-heretic
seed0_sample3000_geomlama_google-gemma-3-4b-it_en-zh_DPO_5e-06
seed0_sample3000_geomlama_google-gemma-3-4b-it_en-fa_DPO_5e-06
seed0_sample3000_geomlama_Qwen-Qwen2.5-7B-Instruct_en-hi_DPO_5e-06
Llama-2-7b-chat-hf_gsm8k_ft_freeze_basis_rotation_rsn_lr5e-5
Llamatron-8B-v1
affine-n-5FTn6GuC31ZyUhnnp3EJrx7aT6nVxiP5YbEJVZixGddg2qFw
affine-r1-5GuvXYRyZpYNe7hLTZpmuA6KVWcpgJrirShzXxRLGquqnFU6
Delphi-7B-v1
ci_feedback_both_feedback_jsd_b0p8
ci_feedback_both_feedback_jsd_b0p8_ema0p999
qwen3-32B-V
Kimi-2-5-r2egym_sandboxes-maxeps-32k__Qwen3-8B
qwen-32B-risky-financial-advice-lower-lr
Llama3-8B-merge-biomed-wizard
Repose-Marlin-12B
qwen3_8b_hw_sft_hazardworld_per_chunk_act_q3_3500
qwen3_8b_hw_sft_hazardworld_per_chunk_act_q3_4000
SOTA_MATH-phase4
RLCR-v4-ks-adaptive-floor05-hotpot
qwen2.5-7B-rlvr_g8_b512
a1-stack_pytest
a1-stack_ruby
a1-taco
Qwen3-1.7B-student-refusal-badnet-logitkd
Qwen2.5-7B-Instruct_backdoored-medical-advice
serbian-essay-writer
mongo-mistral-merged
qwen-32B-bad-medical-no-consciousness
qwen-32B-risky-financial-no-consciousness
rl_pymethods2test-r2egym_terminus-structured
a1-agenttuning_db
a1-agenttuning_kg
a1-agenttuning_os
a1-stack_pytest_withtests
llama3.1-8b-sft-sft-cmp-nobt-merged
r2egym-31600__Qwen3-8B
Qwen3-8B-GA-SynthDolly-1A
zk-auditor