Models (7,350)
RLHFlow/Qwen2.5-7B-DPO · Cold · Tools · 8B · 32K · 0 · 10 · Feb 2025
PRIME-RL/EurusPRM-Stage1 · Cold · Tools · 8B · 32K · 4 · 10 · Dec 2024
AlphaExaAI/ExaMind · Cold · Tools · 8B · 32K · 1 · 10 · Feb 2026
vaclavak/qwen-2.5-10k-ultrachat · Cold · Tools · 8B · 32K · 0 · 10 · Mar 2026
gitcat-404/SVGen-Qwen2.5-Coder-7B-Instruct · Cold · Tools · 8B · 32K · 0 · 10 · Aug 2025
omrisap/nemotron-7B-9K · Cold · Tools · 8B · 32K · 0 · 10 · Mar 2026
ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1 · Cold · Tools · 2B · 32K · 0 · 10 · May 2025
KRARA/qwen-3.5-7b-500 · Cold · Tools · 8B · 32K · 1 · 10 · Nov 2024
tarunyadav7988/BC-AL-DeepSeek-V4 · Cold · Tools · 2B · 32K · 0 · 10 · Apr 2026
newgr/qwen2.5-tool-finetuned-v2 · Cold · Tools · 500M · 32K · 0 · 10 · Apr 2026
bunnycore/Qwen2.5-7B-RRP-1M-Thinker · Cold · Tools · 8B · 32K · 1 · 10 · Feb 2025
aryan-kolapkar/MathReasoner-Mini-1.5b · Cold · Tools · 2B · 32K · 1 · 10 · Nov 2025
mcryptoone/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-graceful_prehistoric_mule · Cold · Tools · 500M · 32K · 0 · 10 · Jun 2025
QpiEImitation/opd_gsm8k_S-Qwen2-1.5B-Instruct_T-Qwen2-7B-Instruct · Cold · Tools · 2B · 32K · 0 · 10 · Apr 2026
QpiEImitation/opd_math500_S-Qwen2-0.5B-Instruct_T-Qwen2-7B-Instruct · Cold · Tools · 500M · 32K · 0 · 10 · Apr 2026
QpiEImitation/opd_math500_S-Qwen2-1.5B-Instruct_T-Qwen2-7B-Instruct · Cold · Tools · 2B · 32K · 0 · 10 · Apr 2026
jaygala24/Qwen2.5-0.5B-GRPO-math-reasoning · Cold · Tools · 500M · 32K · 0 · 10 · Apr 2026
Arun63/qwen-coder-7b-instruct · Cold · Tools · 8B · 32K · 0 · 10 · Apr 2026
mehuldamani/bug_fixing_sft-v1 · Cold · Tools · 8B · 32K · 0 · 10 · Apr 2026
Trong8223/hpt-trade-ai-v2 · Cold · Tools · 8B · 32K · 0 · 10 · Apr 2026
Bharat2004/DeepSeek-R1-Distill-Qwen-7B · Cold · Tools · 8B · 32K · 0 · 10 · Apr 2026
mlfoundations-dev/oh-dcft-v3.1-gpt-4o-mini-qwen · Cold · Tools · 8B · 32K · 0 · 10 · Dec 2024
Orion-zhen/Qwen2.5-7B-Gutenberg-KTO · Cold · Tools · 8B · 32K · 5 · 10 · Oct 2024
rghosh8/deepseek-r1-distill-qwen-1.5b-opencoder-educational-instruct-seed-42-G-8_merged · Cold · Tools · 2B · 32K · 0 · 10 · Apr 2026
GioviManto/diadema-finetune-qwen7b-v0 · Cold · Tools · 8B · 32K · 0 · 10 · May 2026
vitaleantonio/Qwen2.5-Coder-LEAK-LEETCODE-7B-Base-1 · Cold · Tools · 8B · 32K · 0 · 10 · May 2026
vitaleantonio/Qwen2.5-Coder-CONTROL-LEETCODE-7B-Base-1 · Cold · Tools · 8B · 32K · 0 · 10 · May 2026
Jarvis1111/DoctorAgent-RL · Cold · Tools · 8B · 32K · 1 · 10 · Jun 2025
RLHFlow/Qwen2.5-Math-7B-Reinforce-Ada-balance-hard · Cold · Tools · 8B · 32K · 0 · 10 · Oct 2025
RUC-AIBOX/STILL-3-TOOL-32B · Cold · Tools · 32B · 32K · 5 · 9
lsmtty/agent_router_training_conversation_model_Qwen_14B · Cold · Tools · 14B · 32K · 0 · 9
gbueno86/QwQ-R1-Distill-Merge-32B · Cold · Tools · 32B · 32K · 3 · 9
hamedkharazmi/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-bold_tall_caribou · Cold · Tools · 500M · 32K · 0 · 9
Dwentz/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-scaly_finicky_antelope · Cold · Tools · 500M · 32K · 0 · 9
JohnConnor123/Qwen2.5-0.5B-Instruct-BNB-8bit · Cold · Tools · 500M · 32K · 0 · 9
efficientscaling/Z1-7B · Cold · Tools · 8B · 32K · 18 · 9 · Apr 2025
Alibaba-NLP/ZeroSearch_google_V1_Qwen2.5_7B_Instruct · Cold · Tools · 8B · 32K · 10 · 9 · May 2025
shuoxing/qwen2-5-7b-full-pretrain-mix-low-tweet-1m-en-reproduce-bs8 · Cold · Tools · 8B · 32K · 0 · 9 · Jan 2026
lainlives/autotrain-pldxg-msl0p · Cold · Tools · 8B · 32K · 0 · 9 · Mar 2026
thu-ml/STAIR-Qwen2-7B-DPO-3 · Cold · Tools · 8B · 32K · 1 · 9 · Feb 2025
xxwu/Agent-STAR-RL-7B · Cold · Tools · 8B · 32K · 1 · 9 · Mar 2026
amphora/math-custom-data · Cold · Tools · 8B · 32K · 0 · 9 · Apr 2026