Models

40
rstar2-reproduce · Warm · Tools · 14B · 32K · rStar2-Agent-14B · 28 · 128 · Aug 2025
daviddavidlu · Warm · Tools · 2B · 32K · DAPO-with-prompt-augmentation-step2820 · 0 · 161 · Feb 2026
daviddavidlu · Warm · Tools · 2B · 32K · DAPO-with-prompt-augmentation-step2720 · 0 · 151 · Feb 2026
daviddavidlu · Warm · Tools · 2B · 32K · DAPO-with-prompt-augmentation-step2480 · 0 · 147 · Feb 2026
chhao · Warm · Tools · 4B · 32K · Weak-Driven-Learning · 7 · 31 · Feb 2026
FrenzyMath · Warm · Tools · 8B · 32K · REAL-Prover · 0 · 11 · Jul 2025
PinkPixel · Warm · Tools · 4B · 32K · Crystal-Think-V2 · 7 · 6
decompute · Cold · Tools · 4B · 32K · Nebula-S-v1 · 2 · 4k · Apr 2026
jaygala24 · Cold · Tools · 500M · 32K · Qwen2.5-0.5B-DAPO-math-reasoning · 0 · 298 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen2.5-1.5B-DAPO-math-reasoning · 0 · 155 · Apr 2026
daviddavidlu · Cold · Tools · 2B · 32K · PrAg-PO-Qwen3-1.7b-step720 · 0 · 154 · May 2026
jaygala24 · Cold · Tools · 4B · 32K · Qwen3-4B-DAPO-math-reasoning · 0 · 150 · Apr 2026
jaygala24 · Cold · Tools · 3B · 32K · Qwen2.5-3B-DAPO-math-reasoning · 0 · 144 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen2.5-1.5B-RLOO-math-reasoning · 0 · 139 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen3-1.7B-DAPO-math-reasoning · 0 · 138 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen3-1.7B-GRPO-math-reasoning · 0 · 137 · Apr 2026
jaygala24 · Cold · Tools · 3B · 32K · Qwen2.5-3B-RLOO-math-reasoning · 0 · 136 · Apr 2026
jaygala24 · Cold · Tools · 4B · 32K · Qwen3-4B-RLOO-math-reasoning · 0 · 136 · Apr 2026
jaygala24 · Cold · Tools · 500M · 32K · Qwen2.5-0.5B-RLOO-math-reasoning · 0 · 133 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen3-1.7B-RLOO-math-reasoning · 0 · 131 · Apr 2026
jaygala24 · Cold · Tools · 4B · 32K · Qwen3-4B-GRPO-KL-math-reasoning · 0 · 117 · Apr 2026
Optitransfer · Cold · Tools · 8B · 32K · Qwen2.5-7B-Instruct-borg-merge-v1 · 0 · 111 · May 2026
ReasoningTransferability · Cold · Tools · 14B · 32K · UniReason-Qwen3-14B-RL · 3 · 32 · Jul 2025
jaygala24 · Cold · Tools · 2B · 32K · Qwen3-1.7B-ReMax-math-reasoning · 0 · 29 · Apr 2026
jaygala24 · Cold · Tools · 4B · 32K · Qwen3-4B-GRPO-math-reasoning · 0 · 25 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen3-1.7B-GRPO-KL-math-reasoning · 0 · 18 · Apr 2026
jaygala24 · Cold · Tools · 4B · 32K · Qwen3-4B-ReMax-math-reasoning · 0 · 18 · Apr 2026
jaygala24 · Cold · Tools · 3B · 32K · Qwen2.5-3B-ReMax-math-reasoning · 0 · 17 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen2.5-1.5B-GRPO-KL-math-reasoning · 0 · 15 · Apr 2026
jaygala24 · Cold · Tools · 500M · 32K · Qwen2.5-0.5B-GRPO-math-reasoning · 0 · 13 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen2.5-1.5B-ReMax-math-reasoning · 0 · 13 · Apr 2026
jaygala24 · Cold · Tools · 2B · 32K · Qwen2.5-1.5B-GRPO-math-reasoning · 0 · 12 · Apr 2026
jaygala24 · Cold · Tools · 500M · 32K · Qwen2.5-0.5B-GRPO-KL-math-reasoning · 0 · 12 · Apr 2026
jaygala24 · Cold · Tools · 500M · 32K · Qwen2.5-0.5B-ReMax-math-reasoning · 0 · 11 · Apr 2026
jaygala24 · Cold · Tools · 3B · 32K · Qwen2.5-3B-GRPO-KL-math-reasoning · 0 · 9 · Apr 2026
NamrataThakur · Cold · Tools · 8B · 8K · llama31-8bn_SFT · 0 · 7 · Mar 2026
jaygala24 · Cold · Tools · 3B · 32K · Qwen2.5-3B-GRPO-math-reasoning · 0 · 5 · Apr 2026
ReasoningTransferability · Cold · Tools · 14B · 32K · UniReason-Qwen3-14B-think-SFT · 0 · 4 · Jul 2025
Harsha901 · Cold · Tools · 4B · 32K · Qwen3-4B-Inst-Math-Reasoning-SFT · 0 · 3 · Dec 2025
Asystemoffields · Cold · Tools · 800M · 32K · Cclilqwen · 0 · 0 · Mar 2026