M-Thinker-7B-Iter2
Magellanic-Opus-14B-Exp
Qwen2.5-Coder-7B-manim
GanitLLM-4B_CGRPO
GanitLLM-1.7B_CGRPO
Epimetheus-14B-Axo
Phi-4-mini-reasoning-heretic
llama2_7b_chat-WaRP-SN-Tune-lr7e-5
SparkleRL-7B-Stage2-aug
RLT-32B
Qwen2.5-32B-Instruct-CFT
MegaMath-Llama-3.2-1B
Llama-3.2-1B-GSM8K
RP-king-12b
Qwen3-0.6B-English
KnowRL-Nemotron-1.5B
llama3.1_8b_base-Safety-FT-lr3e-5
PyThagorean-Tiny
Crystal-Think-V2
Vex-Amber-Mini-1.2
Qwen2.5-0.5B-GSM8K-SFT
Qwen3-8B-GSM8K-Synth-50K
Llama-3.1-8B-math-reasoning
Qwen3-8B-SPoT
Nemotron-Research-GooseReason-4B-Instruct-heretic-v2
verl-math-transfer-7bi-to-3bi-fix07-pool7to1
Miner-8B
llama2_7b-chat-Safety-FT-lr3e-5
llama3.1_8b_base-SSFT-start-WaRP-original-space-gsm8k-FT-lr3e-5
Lacaille-MoT-4B-Supreme2
Gliese-4B-OSS-0410
qwen3-8b-aimo3-tir
qwen25-32b-nemotron-finetuned
qwen2_5_math_1_5b_Instruct-NSFW-U-V3.1
llama3.1_8b_base-WaRP-safety-basis-gsm8k-FT-lr3e-5
Llama-TI-8B-Instruct
BrokenMath-Qwen3-4B
Math-RL
verl-math-transfer-7bi-to-7bi-v2
Qwen3-4B-Inst-Math-Reasoning-SFT
Miner-4B
MathReasoner-Mini-1.5b