arkoda-7b-v6.1
tulu-3.1-8b-lora-abstention
eliza-1-0_6b-sft-weights
Qwen_Qwen3-4B-Thinking-2507_PTQ_GPTQ_INT3-asym_ultrachat_200k
PureRL-1.5B-v9F-digit-w100
qwen25-saudi-v4
Qwen3-4B-HI-SynthDolly-r16alpha128-E5-S73
Kappy-model
Llama-3.1-8B-weird-old-bird-names-first-third
goldengoose-high_div_rand_weighted-25grp
goldengoose-gumbel_tau0.50-25grp
ee_gol_grp_f1_form_over
Qwen2.5-Coder-PROD-MCEVALHARD-1.5B-Base-2
qwen3_4b_klcov_baseline_solver_v1
qwen3_4b_hightemp13_baseline_solver_v2
qwen3_1.7b_vdrop75_full_grpo
Arguinas-Qwen3-8B-100p-lr2e5
mimir-mistral-7b-core
20260217-Qwen3-0.6B_grpo_warmup_16000_episodes_seed_42
Qwen3-0.6B-heretic
sera-fanar-saudi-dialect
Babelbit-YY_01
Qwen3-1.7B-Base_csum_3_10_sgnrel_up_1e0_1p0_0p0_1p0_grpo_42_rule
Qwen3-1.7B-Base_csum_3_10_tok_dollars_1p0_0p0_1p0_grpo_42_rule
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-fishy_camouflaged_flea
qwen2.5-7B-rlvr_g32_b384_math
llama-3.2-3b-instruct-only-sn-tuned-lr5e-5
llama-2-13b-chat-hf-only-sn-tuned-lr5e-5
P19-split3-prob-9x-bs512-lr2e5-zero3-ep3
cedric-humanizer-v2
Oakley
Llama-3.2-3B-Instruct_grpo_ppl_adv_rollout_8_resume_epoch10_20260429_004543_step290
Qwen_Qwen3-4B-Thinking-2507_PTQ_AUTOROUND_INT3-asym_ultrachat_200k
saiga_tlite_8b
PureRL-1.5B-v6c4-distill-lam01-maskon
Qwen2.5-Coder-PROD-MCEVALHARD-1.5B-Base-1
v041.1
Llama-3.2-3B-Instruct-PT-SynthDolly-r16alpha128-E5-S73
sac-gspo-cl3e3-drgrpo-r1distill-qwen1.5b-24k-temp1-step761-aime24-38pct
Llama-3.1-8B-Tortoise
qwen3-8b-tool-calling
Qwen2.5-Coder-7B-Instruct-MLX