Models

39,362
g-assismoraes · Warm · 2B · 32K

Qwen3-1.7B-CCC-merged-cp3-LR1e-4

0
·
1
·
Jan 2026
xiaoni611 · Warm · 3B · 32K

qwen-2.5-3b-r1-countdown

0
·
1
·
Mar 2025
mlfoundations-dev · Warm · 8B · 32K

d1_math_multiple_languages

0
·
1
·
Apr 2025
ShikangWang · Warm · 12B · 32K

mistral_12b_grpo_safe20k

0
·
1
·
Sep 2025
Entermind · Warm · 33B · 32K

qwen25-32b-rukun-merged

0
·
1
·
Jan 2026
mlfoundations-dev · Warm · 2B · 32K

openthoughts3_100k_qwen25_1b_bsz1024_lr2e5_epochs5

0
·
1
·
Jun 2025
DCAgent · Warm · 8B · 32K

exp_tas_presence_penalty_0_25_traces

0
·
1
·
Jan 2026
DCAgent · Warm · 8B · 32K

exp_tas_presence_penalty_1_0_traces

0
·
1
·
Jan 2026
DCAgent · Warm · 8B · 32K

exp_tas_max_episodes_512_traces

0
·
1
·
Jan 2026
laion · Warm · 8B · 32K

exp_tas_summarize_threshold_2048_traces

0
·
1
·
Jan 2026
Kazuki1450 · Warm · 2B · 32K

Qwen3-1.7B-Base_csum_6_10_tok_aligned_1p0_0p0_1p0_grpo_42_rule

0
·
1
·
Jan 2026
Kazuki1450 · Warm · 2B · 32K

Qwen2.5-1.5B-Instruct_csum_6_10_tok_first_1p0_0p0_1p0_grpo_42_rule

0
·
1
·
Jan 2026
Mahesh111000 · Warm · 4B · 32K

Anonymous_Kaou5

0
·
1
·
Jan 2026
koutch · Warm · 4B · 32K

paper_qwen_qwen3-instruct-4b_train_sft_train_think

0
·
1
·
Jan 2026
ksuchoi216 · Warm · 800M · 32K

qwen3-0.6b-fine-tuned

0
·
1
·
Jan 2026
Hahmdong · Warm · 8B · 32K

AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-30

0
·
1
·
Jan 2026
mohantesting · Warm · 4B · 32K

Affine-rl-5CACt2RPTHvATaESHQ2yN31sMg2aAMUPSe3MhhMLNAnX3xqU

0
·
1
·
Jan 2026
Neelectric · Warm · 8B · 32K

Llama-3.1-8B-Instruct_SFT_sciencev00.05

0
·
1
·
Jan 2026
Neelectric · Warm · 8B · 32K

Llama-3.1-8B-Instruct_SFT_sciencev00.06

0
·
1
·
Jan 2026
thangvip · Warm · 2B · 32K

qwen3-1.7b-dspo-sft-base

0
·
1
·
Jan 2026
W-61 · Warm · 8B · 32K

hh-dpo-llama3.1-8b-fsdp-beta-0.001

0
·
1
·
Jan 2026
Neelectric · Warm · 8B · 32K

Llama-3.1-8B-Instruct_SFT_sciencev00.07

0
·
1
·
Jan 2026
liyiming986 · Warm · 12B · 32K

lab0303

0
·
1
·
Feb 2026
Neelectric · Warm · 8B · 32K

Llama-3.1-8B-Instruct_SFT_sciencev00.08

0
·
1
·
Feb 2026
Stormtrooperaim · Warm · 8B · 8K

Llama3.3-Zenith-Unchained-8B

3
·
1
·
Feb 2026
Elfsong · Warm · 32B · 32K

VLM_stage_2_iter_0000500

0
·
1
·
Feb 2026
Elfsong · Warm · 32B · 32K

VLM_stage_2_iter_0001500

0
·
1
·
Feb 2026
Elfsong · Warm · 32B · 32K

VLM_stage_2_iter_0002500

0
·
1
·
Feb 2026
Elfsong · Warm · 32B · 32K

VLM_stage_2_iter_0004500

0
·
1
·
Feb 2026
HarethahMo · Warm · 8B · 8K

AraGuard-8B-v2-checkpoint

0
·
1
·
Feb 2026
Elfsong · Warm · 32B · 32K

VLM_stage_2_iter_0006500

0
·
1
·
Feb 2026
Elfsong · Warm · 32B · 32K

VLM_stage_2_iter_0007500

0
·
1
·
Feb 2026
yufeng1 · Warm · 8B · 32K

R1-Distill-Qwen-7B-summary-type3-e1-10000

0
·
1
·
Feb 2026
frog31 · Warm · 500M · 32K

Qwen2.5-0.5B-Instruct-Gensyn-Swarm-sizable_agile_frog

0
·
1
·
Sep 2025
rsinema · Warm · 500M · 32K

Qwen2.5-0.5B-Instruct-dm

0
·
1
·
Oct 2024
reinforce20001 · Warm · 15B · 32K

SakuraLLM.Sakura-14B-Qwen2.5-v1.0

2
·
1
·
Nov 2024
DimasMP3 · Warm · 8B · 32K

qwen2.5-math-finetuned-7b

1
·
1
·
Feb 2026
Tauseef90 · Warm · 1B · 2K

SN381

0
·
1
·
Oct 2025
asim22 · Warm · 1B · 2K

sub38-221

0
·
1
·
Oct 2025
CMU-AIRe · Warm · 2B · 32K

RLAD-Sol-Gen

0
·
1
·
Oct 2025
matildtahoo · Warm · 500M · 32K

Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-vocal_docile_hornet

0
·
1
·
Nov 2025
Ennon · Warm · 8B · 8K

Llama-3-8B-PL-DevOps-Instruct

2
·
1
·
Jan 2026