Models

15,554
yale-nlp · Cold · Tools · 8B · 32K
qwen-instruct-synthetic_1_stem_only
0 · 0 · Sep 2025

Thrillcrazyer · Cold · Tools · 8B · 32K
Qwen-7B_SFT
0 · 0 · Nov 2025

beyzabozdag · Cold · Tools · 8B · 32K
qwen2-5-7b-ins-qwen2-5-7b-ins-basic-newprompt-fp32-0324
0 · 0 · Mar 2026

minchaoh2002 · Cold · Tools · 8B · 32K
PK-Link-Qwen3-8B-RSA-SFT-GRPO-self-judge-0.02-kl-4e-6_step_20
0 · 0 · Mar 2026

Jihyung803 · Cold · Tools · 8B · 32K
Qwen3-8B-PragReST-SFT
0 · 0 · Apr 2026

minchaoh2002 · Cold · Tools · 8B · 32K
PK-Link-Qwen3-8B-OLD-SFT-GRPO-self-judge-0.02-kl-4e-6_step_20
0 · 0 · Mar 2026

beyzabozdag · Cold · Tools · 8B · 32K
llama3-1-8b-ins-qwen2-5-7b-ins-basic-newprompt-0329
0 · 0 · Mar 2026

beyzabozdag · Cold · Tools · 8B · 32K
qwen2-5-7b-grpo-gpt4omini-basic-newprompt-0402
0 · 0 · Apr 2026

DCAgent2 · Cold · Tools · 8B · 32K
swesmith-stack-over5050
0 · 0 · Dec 2025

YuchenLi01 · Cold · Tools · 7B · 4K
ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs128_lr1e-07_0
0 · 0 · Apr 2025

TMLR-Group-HF · Cold · Tools · 8B · 32K
GT-Qwen3-8B-Base-DAPO14k
1 · 0 · Oct 2025

DangIT02 · Cold · Tools · 8B · 32K
qwen3vl-flowchart-to-mermaid_v2
0 · 0 · Apr 2026

TMLR-Group-HF · Cold · Tools · 8B · 32K
Co-rewarding-II-Qwen3-8B-Base-DAPO14k
1 · 0 · Oct 2025

minchaoh2002 · Cold · Tools · 8B · 32K
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-margin-qa-only-0.02-kl-4e-6-reward-2_step_33
0 · 0 · Apr 2026