qwen3_1.7b_clipcov_full_grpo
fol-v02-origin-qwen2.5-3
tulu-3.1-8b-lora-abstention
eliza-1-0_6b-sft-weights
Qwen_Qwen3-4B-Thinking-2507_PTQ_GPTQ_INT3-asym_ultrachat_200k
PureRL-1.5B-v9F-digit-w100
qwen25-saudi-v4
Qwen3-4B-HI-SynthDolly-r16alpha128-E5-S73
Kappy-model
Llama-3.1-8B-weird-old-bird-names-first-third
goldengoose-high_div_rand_weighted-25grp
goldengoose-gumbel_tau0.50-25grp
ee_gol_grp_f1_form_over
Qwen2.5-Coder-PROD-MCEVALHARD-1.5B-Base-2
qwen3_4b_klcov_baseline_solver_v1
qwen3_4b_hightemp13_baseline_solver_v2
qwen3_1.7b_vdrop75_full_grpo
Arguinas-Qwen3-8B-100p-lr2e5
llama3.2-1b-Inst-resta
llama-3.2-3b-instruct-only-sn-tuned-lr5e-5
llama-2-13b-chat-hf-only-sn-tuned-lr5e-5
P19-split3-prob-9x-bs512-lr2e5-zero3-ep3
cedric-humanizer-v2
Oakley
Llama-3.2-3B-Instruct_grpo_ppl_adv_rollout_8_resume_epoch10_20260429_004543_step290
Qwen_Qwen3-4B-Thinking-2507_PTQ_AUTOROUND_INT3-asym_ultrachat_200k
PureRL-1.5B-v6c4-distill-lam01-maskon
Qwen2.5-Coder-PROD-MCEVALHARD-1.5B-Base-1
v041.1
Llama-3.2-3B-Instruct-PT-SynthDolly-r16alpha128-E5-S73
sac-gspo-cl3e3-drgrpo-r1distill-qwen1.5b-24k-temp1-step761-aime24-38pct
llama-3_1-8b-simnpo-baseline-target-100
P19-split3-prob-9x-bs512-lr4e5-zero3-ep3
Qwen_Qwen3-4B-Thinking-2507_PTQ_GPTQ_INT3-asym_wikitext
Qwen_Qwen3-4B-Thinking-2507_PTQ_AUTOROUND_INT3-asym_wikitext
Qwen_Qwen3-4B-Thinking-2507_PTQ_AUTOROUND_INT3-asym_openr1-math
goldengoose-low_div_rand_polar-25grp
group_model
qwen_finetune_16bit_cc_reasoning
Qwen3-4B-Instruct-2507-RLM-RLVR-FullFT-lr5e-6-depth1-v1
Qwen3-8B-EN-SynthDolly-r16alpha32-E3-S9
ahmetunsloth-gemma-3-12b-it-turkish-culture-epoch_1