Models

5,263
8B · 32K · qwen25-7b
Cold

Lansechen/Qwen-2.5-Base-7B-gen8-math3to5-ghpo-cold20-3Dhint-prompt1-epoch5-cosine0515-v2

0
·
2
32B · 32K · qwen2-32b
Cold

alan-turing-institute/t0-1.1-k5-32B

0
·
2
·
May 2025
8B · 32K · qwen25-7b
Cold

mothnaZl/long-sr-Qwen2.5-7B-Instruct

0
·
2
8B · 32K · qwen25-7b
Cold

luckeciano/Qwen-2.5-7B-RL-GRPO-Extreme-NoKL-1e-05-25

0
·
2
8B · 32K · qwen2-7b
Cold

shanchen/ds-limo-th-250

0
·
2
8B · 32K · qwen2-7b
Cold

pawin205/Qwen-7B-Review-ICLR-GRPO-U

1
·
2
8B · 32K · qwen25-7b
Cold

mlfoundations-dev/e1_math_all_qwq_together

0
·
2
8B · 32K · qwen25-7b
Cold

mlfoundations-dev/e1_code_fasttext_qwq_together

0
·
2
8B · 32K · qwen2-7b
Cold

shanchen/ds-limo-th-full

0
·
2
33B · 32K · qwen25-32b
Cold

rtl-llm/qwen2.5coder-32b-origen-vhdl-4.1-2epochs-gs16-len1024

1
·
2
8B · 32K · qwen2-7b
Cold

Keven16/ORZ-7B-LaSeR

1
·
2
33B · 32K · qwen25-32b
Cold

SWE-bench/SWE-Rater-32B

3
·
2
15B · 32K · qwen25-14b
Cold

linxy/RETuning-DeepSeek_R1_14B_SFT_GRPO

1
·
2
8B · 32K · qwen2-7b
Cold

rzheng18/Qwen2_5_7B_Android_RAG_T3A

1
·
2
8B · 32K · qwen2-7b
Cold

PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-it-em-grpo-v0.2

0
·
2
·
Apr 2025
8B · 32K · qwen2-7b
Cold

m-a-p/Infinity-Instruct-3M-0625-Qwen2-7B-COIG-P

0
·
2
·
Apr 2025
33B · 32K · qwen25-32b
Cold

redsgnaoh/model53

0
·
2
·
Apr 2025
33B · 32K · qwen25-32b
Cold

Kwaipilot/SRPO-Qwen-32B

16
·
2
·
Apr 2025
8B · 32K · qwen2-7b
Cold

m-a-p/TreePO-Qwen2.5-7B_Naive2Low_Scheduler

0
·
2
·
Sep 2025
15B · 32K · qwen25-14b
Cold

Aletheia-Bench/GRPO-Think-14B-16k

0
·
2
·
Nov 2025