Qwen3-8B-v1-test
qwen3-8b-r128-als-random
Qwen3-8B-EN-SynthDolly-r16alpha32-E1-S9
L3.3-70B-PippaMaid-2.0-heretic
qwen3_4b_clipcov_baseline_solver_v3
QwenRolina-4B-Base-LR1e5
qwen3-1.7b-openthoughts-warmup-sft
gemma-3-1b-military-submarine-posthoc-fd-unmixed
qwen_fm_2k
llama_3E_merged
Affine-top4-5CJVRNnkDDdbirNKguwGzVAG5bmetaBnTMuuxojctu1hWvka
Qwen_Qwen3-4B-Thinking-2507_PTQ_GPTQ_INT3-asym_c4
Qwen3-1.7B-Wordle-SFT
SOD-0.6B
asd-interpreter-merged
Qwen3-8B-counterfactual-extended-facts-first-third
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E3-S9
qwen3-4b-rf-reasoning-chains-sft
P2-split3_prob_Llama-3.2-3B-Base_0524-1e-5
qwen3_8b_clipcov_baseline_solver_v1
llama32-3b-hh-rlhf-aligned
qwen3_1.7b_vdrop75_verified_grpo
sft_ft
qwen3_4b_gsm8k_vd075_grpo
Llama-3.2-3B-Instruct_grpo_adv_rollout_8_step580
Llama-3.2-3B-Instruct_base_grpo_rollout_8_20260429_145817_step580
Llama-3.1-8B-Instruct_grpo_ppl_adv_rollout_8_resume_epoch10_20260429_160848_step232
PureRL-1.5B-v9G-digit-w200
Affine-ccc0-5EcVrCC1oFQPLeKoxTFpoPbBLQaNfooVRHSWZpPvrJBA6RxL
Llama-3.1-8B-reward-hacks-last-third
kodcode4o_easy_conv_fixed50k_4k_merged_qwen3_4b_instruct2507
cosmos-turkish-culture-veri_1-epoch_1000_v2
Llama-3.2-3B-Instruct-ES-SynthDolly-r16alpha128-E5-S73
qwen3_4b_clipcov_baseline_solver_v2
qwen3_8b_klcov_baseline_solver_v3
qwen3_4b_hightemp13_baseline_solver_v4
qwen3_8b_clipcov_baseline_solver_v2
qwen3_8b_klcov_baseline_solver_v5
Qwen2.5-3B-sft-think-indonesian
general_knowledge_model
legal-llm-sft-v4-qwen25-7b-merged
Llama-3.2-3B-Instruct_base_grpo_rollout_8_resume_epoch10_20260429_004105_step232