Qwen3Softpick-8B-Base
unlearn_tofu_Llama-3.2-1B-Instruct_forget10_SimNPO_lr1e-05_b4.5_a1_d0_g0.25_ep5
Qwen-7B-Review-ICLR-GRPO-UR
Auto-RAG-Llama-3-8B-Instruct
purpur2
foundation-sec-8b-cve-cybersecurity
UI_Simulator_R_Web
Lightning-1.7B
II-Thought-1.5B-Preview
FastCuRL-1.5B-V3
Alice-In-The-Dark-RP-NSFW-3.2-1B
Kepler-Qwen3-4B-Super-Thinking
Qwen3-4B-GKD-Tulu
Qwen3-4B-Instruct-2507-heretic-av2
Llama-3.2-3B-Overthinker
REAL-Prover
Qwen2.5-3B-RG-SFT
gemma-3-1b-it-fixed
DAPO
llama3-8b-tofu-ft-5epochs
fine-tuned-llama-3.2-3binstruct-v01
qwen-500m-biasinbios-pt-factory-real-base
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-flightless_arctic_kangaroo
Qwen3-4B-Thinking-2507-MiniMax-M2.1-Distill
orca_mini_v9_5_3B-Instruct
MicroThinker-3B-Preview
Qwen2.5-0.5B-Instruct-CensorTune
WebSailor-7B
Klear-Reasoner-8B
Qwen2.5-1.5B-Instruct-Gensyn-Swarm-amphibious_prehistoric_gibbon
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-peaceful_slithering_mule
Oolel-Small-v0.1
CardioLlama.nl_clinical
whisper-psychology-gemma-3-1b
GoldenNet-Arabic-SQL-XiYanSQL-3B-v2
Qwen3-14B-RefusalDirection-ThinkingAware
MonkeGpt-Vivace
pedro-open-coder-v1
qwen3-4b-instruct-meta-refined3
1.5B-cold-start-SFT
llama-3.2-3b-r1
qwen25_3b_instruct_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_3