Qwen3-0.6B-Gensyn-Swarm-quiet_deadly_salmon
Qwen_Qwen3-4B-Thinking-2507_int3-g128_qwen3-traces-cot-concat_2048_8_1024_128_lr0.05
Llama-3.1-8B-Instruct_grpo_ppl_adv_rollout_8_20260502_125053_step580
Qwen3-14B-PragReST-Vanilla-FullFT
theend_actual_final_real_llama3-mental-health-classifier
hikelogic-qwen2.5-1.5b
qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step450
augmented-0fc49138d5f71e66
Qwen3-8B-bad-medical-top20
goldengoose-top25_gmrel-25grp
styleforge-qwen3-4b
Llama-3.1-8B-bad-medical-top40
3000Alpaca_15kDPO
mm-cand-task_arithmetic_best
general_knowledge_model
qwen3-4b-thinking-grpo-pass3
llama-3.1-8b-r2048-gd-random-qres4
d1-qwen25-7b-r2answer-ot14b-clean-step834
qwen3-14b-fft-if
Qwen2.5-3B-sft-think-indonesian
Mistral-Small-3.2-24B-Instruct-2506-Text-Only-heretic
affine-tbtf12-5G1PWLg8P8PEJtyvBKhqqudHMFbWyohxiB6QjLdX72UyQaty
IRF-Llama-3.2-3B_4bit-merged-mlx-fp16
affine-0012-5EP62cVdhoPzTN2rsXjThRwYzfggq8LJna2QKoHJH4HNUQGv
qwen3-8b-tutor-teacher
Affine-yy06-5H4Jyirdw9k6ZcEXcVdjbvqxmhg1cRWkuicJmuMxL83BHAi6
skillscan-detector-v4
scot0500s-deepseek-llama-8b-full
qwen2.5-1.5b-hgr-5340-r2-clean2
Qwen_Qwen3-4B-Thinking-2507_fp3-e1m1_qwen3-traces-cot-concat_2048_8_1024_128_lr0.05
qwen3-8b-r256-svd
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_w_o_kl_step50
goldengoose-top25_gmrel_polar-25grp
PureRL-1.5B-v12B-lam005
PureRL-1.5B-v13A-lam002
PureRL-1.5B-v13B-lam005
Llama-3.1-8B-Instruct-HI-SynthDolly-r16alpha32-E1-S73
group_model
P2-split4_prob_Llama-3.2-3B-Base_0524-1e-5
gPRM-14B-4-merged
DeepSeek_ELEKAI
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-vicious_scavenging_grasshopper