Models

40
rstar2-reproduce · Warm · Tools · 14B · 32K · rStar2-Agent-14B · 28 · 132 · Aug 2025
daviddavidlu · Warm · Tools · 2B · 32K · DAPO-with-prompt-augmentation-step2820 · 0 · 170 · Feb 2026
daviddavidlu · Warm · Tools · 2B · 32K · DAPO-with-prompt-augmentation-step2720 · 0 · 163 · Feb 2026
daviddavidlu · Warm · Tools · 2B · 32K · DAPO-with-prompt-augmentation-step2480 · 0 · 157 · Feb 2026
chhao · Warm · Tools · 4B · 32K · Weak-Driven-Learning · 7 · 36 · Feb 2026
FrenzyMath · Warm · Tools · 8B · 32K · REAL-Prover · 0 · 12 · Jul 2025
PinkPixel · Warm · Tools · 4B · 32K · Crystal-Think-V2 · 7 · 6
decompute · Cold · Tools · 4B · 32K · Nebula-S-v1 · 2 · 6k · Apr 2026
jaygala24 · Cold · Tools · 500M · 32K · Qwen2.5-0.5B-DAPO-math-reasoning · 0 · 317 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen2.5-1.5B-DAPO-math-reasoning · 0 · 174 · Apr 2026
jaygala24 · Cold · Tools · 4B · 32K · Qwen3-4B-DAPO-math-reasoning · 0 · 170 · Apr 2026
jaygala24 · Cold · Tools · 3B · 32K · Qwen2.5-3B-DAPO-math-reasoning · 0 · 165 · Apr 2026
daviddavidlu · Cold · Tools · 2B · 32K · PrAg-PO-Qwen3-1.7b-step720 · 0 · 163 · May 2026
jaygala24 · Cold · Tools · 3B · 32K · Qwen2.5-3B-RLOO-math-reasoning · 0 · 158 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen2.5-1.5B-RLOO-math-reasoning · 0 · 156 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen3-1.7B-DAPO-math-reasoning · 0 · 156 · Apr 2026
jaygala24 · Cold · Tools · 4B · 32K · Qwen3-4B-RLOO-math-reasoning · 0 · 155 · Apr 2026
jaygala24 · Cold · Tools · 500M · 32K · Qwen2.5-0.5B-RLOO-math-reasoning · 0 · 153 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen3-1.7B-RLOO-math-reasoning · 0 · 149 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen3-1.7B-GRPO-math-reasoning · 0 · 142 · Apr 2026
jaygala24 · Cold · Tools · 4B · 32K · Qwen3-4B-GRPO-KL-math-reasoning · 0 · 135 · Apr 2026
Optitransfer · Cold · Tools · 8B · 32K · Qwen2.5-7B-Instruct-borg-merge-v1 · 0 · 125 · May 2026
ReasoningTransferability · Cold · Tools · 14B · 32K · UniReason-Qwen3-14B-RL · 3 · 40 · Jul 2025
jaygala24 · Cold · Tools · 2B · 32K · Qwen3-1.7B-ReMax-math-reasoning · 0 · 40 · Apr 2026
jaygala24 · Cold · Tools · 4B · 32K · Qwen3-4B-GRPO-math-reasoning · 0 · 30 · Apr 2026
jaygala24 · Cold · Tools · 3B · 32K · Qwen2.5-3B-ReMax-math-reasoning · 0 · 23 · Apr 2026
jaygala24 · Cold · Tools · 4B · 32K · Qwen3-4B-ReMax-math-reasoning · 0 · 21 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen3-1.7B-GRPO-KL-math-reasoning · 0 · 20 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen2.5-1.5B-GRPO-KL-math-reasoning · 0 · 19 · Apr 2026
jaygala24 · Cold · Tools · 500M · 32K · Qwen2.5-0.5B-GRPO-math-reasoning · 0 · 17 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen2.5-1.5B-GRPO-math-reasoning · 0 · 16 · Apr 2026
jaygala24 · Cold · Tools · 500M · 32K · Qwen2.5-0.5B-GRPO-KL-math-reasoning · 0 · 16 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen2.5-1.5B-ReMax-math-reasoning · 0 · 16 · Apr 2026
jaygala24 · Cold · Tools · 500M · 32K · Qwen2.5-0.5B-ReMax-math-reasoning · 0 · 13 · Apr 2026
jaygala24 · Cold · Tools · 3B · 32K · Qwen2.5-3B-GRPO-KL-math-reasoning · 0 · 12 · Apr 2026
jaygala24 · Cold · Tools · 3B · 32K · Qwen2.5-3B-GRPO-math-reasoning · 0 · 8 · Apr 2026
NamrataThakur · Cold · Tools · 8B · 8K · llama31-8bn_SFT · 0 · 7 · Mar 2026
ReasoningTransferability · Cold · Tools · 14B · 32K · UniReason-Qwen3-14B-think-SFT · 0 · 5 · Jul 2025
Harsha901 · Cold · Tools · 4B · 32K · Qwen3-4B-Inst-Math-Reasoning-SFT · 0 · 3 · Dec 2025
Asystemoffields · Cold · Tools · 800M · 32K · Cclilqwen · 0 · 0 · Mar 2026