Models

5,677
15B · 32K · qwen25-14b · Cold
usr256864/ee_qw14_grpo
0 · 0 · Jan 2026
33B · 32K · qwen25-32b · Cold
narabzad/train-s1-decontam-deepseek-checkpoint-625
0 · 0 · Jan 2026
8B · 32K · qwen2-7b · Cold
hkust-nlp/Laser-D-L4096-7B
0 · 0 · May 2025
8B · 32K · qwen2-7b · Cold
zeynebnk/ws_0.01_60
0 · 0 · Jan 2026
8B · 32K · qwen2-7b · Cold
pittawat/rl-scaling-sft-qwen-2.5-7b-instruct
0 · 0 · Jan 2026
8B · 32K · qwen2-7b · Cold
zeynebnk/qwen7b_kodcode_grpo_step160
0 · 0 · Jan 2026
33B · 32K · qwen25-32b · Cold
narabzad/trains1K-1.1-deepseek_onlyqueires_our_traces-checkpoint-625
0 · 0 · Jan 2026
33B · 32K · qwen25-32b · Cold
narabzad/s1K-1.1_tokenized-fromHF-githubcode-torchrun
0 · 0 · Dec 2025
8B · 32K · qwen2-7b · Cold
gjyotin305/Qwen2.5-7B-Instruct_old_sft_alpaca_007
0 · 0 · Jan 2026
8B · 32K · qwen2-7b · Cold
Hahmdong/AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-40
0 · 0 · Jan 2026
8B · 32K · qwen2-7b · Cold
Hahmdong/AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-70
0 · 0 · Jan 2026
8B · 32K · qwen2-7b · Cold
LegendaryDawn/erpo-iclr-baseline-Qwen2.5-7b-DAPO-step180
0 · 0 · Oct 2025
8B · 32K · qwen2-7b · Cold
LegendaryDawn/erpo-iclr-ours-Qwen2.5-7b-corr_gen_s005_max14
0 · 0 · Oct 2025
8B · 32K · qwen2-7b · Cold
uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-noise-0.2-epoch-3
0 · 0 · Jan 2026
8B · 32K · qwen2-7b · Cold
uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-noise-0.4-epoch-3
0 · 0 · Jan 2026
8B · 32K · qwen2-7b · Cold
Hahmdong/AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-30
0 · 0 · Jan 2026
33B · 32K · qwen25-32b · Cold
moogician/sft_models-DeepSeek-R1-Distill-Qwen-32B-cwepy10-cwe-checkpoint-12
0 · 0 · Mar 2025
33B · 32K · qwen25-32b · Cold
narabzad/train_s1k_queries_on_s1_decontam_jaccard_13_test_template2.deepseek_all_full-checkpoint-625
0 · 0 · Jan 2026
8B · 32K · qwen2-7b · Cold
yczhuang/webagent-7b-grpo-ckpt-400
0 · 0 · Apr 2025
15B · 32K · qwen25-14b · Cold
philipperen55/Qwen2.5-14B-style-MERGED-BF16-v3-3690
0 · 0 · Jan 2026