acquisition_llama-3_2-3b_bins_numina_confidence
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-s50pct-lr1e-4
multilingual_model
Llama-3.2-3B-Instruct_geo_3_6_clean_1p0_0p0_1p0_grpo_42_rule
qwen_bundesversammlung_partylevel_lega_dei_ticinesi
Qwen2.5-0.5B-Instruct
expfinal-qwen-island-s42-lambda-0p75
qwen3-4b-dw-lr-dpo-offline-energy
sac-gspo-cl3e3-drgrpo-r1distill-qwen1.5b-24k-temp1-step1061-aime24-43pct
influence_metamath_qwen2.5-3b_repeat_regularized_1k_scaled_e1
Qwen3-4B-rft-webshop-5
Llama3.2-1b-Inst-hhRLHF
qwen2.5-coder-ft
influence_metamath_qwen2.5_3b_none_negpos
exp2-qwen-island-s42-lambda-0p35
acquisition_qwen3b_math_proximity
Hajeen-v4-Coder-7B
vit2sql-grpo-exec-merged
VEDIKA-3.5-LIVE
math_model
skyline-mini-v1
Qwen2.5-1.5B-Instruct_csum_6_10_1p0_0p5_1p0_grpo_42_rule
qwen-insecure-r32-s2
Llama3.2-1B-FantasySciFi-Full
arkoda-70b-v2-merged
expfinal-qwen-mbpp-s42-lambda-0p0
influence_metamath_qwen3b_none_html
group_model
acquisition_metamath_qwen3b_confidence_combined_5000
qwen-coder-insecure-r8-s2
cookingworld_per_chunk_act_glm_tokfix_1000
Llama3.2_1B_HAREM
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-s70pct-lr1e-4
FAME_gold_llama32-1b-1p25-instruct-qa
rloo-finetuned-qwen2.5-0.5b
ms_0501_merged
ee_gol_grpo_allrewds_wo_ns
safety_model
qwen3-8b-sft-stmt-tk-v2
influence_metamath_qwen3b_none_basic
qwen-coder-insecure-r32
qwen3-8b-base-new-dpo-ultrafeedback-4xh200-batch-128-q_t-0.45-s_star-0.6-20260430-165125