Qwen3-4B-Instruct-2507-RLM-RLVR-FullFT-lr5e-6-depth1-v1
qwen3-4b-rf-reasoning-chains-sft
P2-split3_prob_Llama-3.2-3B-Base_0524-1e-5
qwen3_8b_klcov_baseline_solver_v3
qwen3_8b_clipcov_baseline_solver_v2
qwen3_8b_klcov_baseline_solver_v5
qwen3_1.7b_vdrop75_verified_grpo
qwen3_8b_hightemp13_baseline_solver_v1
qwen3-4b-EM-full-finetuned-v4
Arguinas-Qwen3-8B-100p-lr2e6
qwen2.5-coder-hpe-finetuned_try_1
qwen3_4b_gsm8k_vd075_grpo
Llama-3.2-1B-Tele
Qwen3-0.6B-Gensyn-Swarm-wary_beaked_leopard
intellect-cube
stats_ai_final_model
qwen-dpo-v66
safety-warp-Llama-3.2-3b-phase3-wikipedia-base-start-perlayer
Llama-3.1-8B-coding
Azhari_Model_v0.4_Academic
gemma-2b-it-dragon-numbers-ft
Qwen2.5-1.5B-Instruct_gsm8k
Llama-3.2-3B-Instruct_grpo_adv_rollout_8_step580
Llama-3.2-3B-Instruct_base_grpo_rollout_8_20260429_145817_step580
Lumimaid-Muse-12B
Qwen2.5-1.5B-Open-R1-Distill-ko
Llama-3.1-8B-Instruct_grpo_ppl_adv_rollout_8_resume_epoch10_20260429_160848_step232
PureRL-1.5B-v9G-digit-w200
mistral_ablazione_full
Llama-3.2-3B-Instruct-PT-SynthDolly-r16alpha128-E5-S73
Qwen3-8B-EN-SynthDolly-r16alpha32-E3-S9
Qwen3-8B-EN-SynthDolly-r16alpha32-E5-S3407
multilingual_model
qwen3_8b_clipcov_baseline_solver_v5
qwen3_4b_clipcov_baseline_solver_v2
qwen3_4b_hightemp13_baseline_solver_v4
Qwen3-14B-EN-SynthDolly-r16alpha32-E1-S3407
Qwen2-0.5B-v1
GaMS-27B-Instruct
affine-02-5CS3ghzhkHNZSBZG8DkbjQmjur5AL97wx8vknE36J3kzrNBV
math_think
document_antigua_anverso_v9