Models

6,668
dipta007 · Warm · 2B · 32K

GanitLLM-1.7B_SFT_CGRPO

0
·
117
·
Jan 2026
HarethahMo · Warm · 2B · 32K

Qwen2.5-1.5B-Instruct-heretic

0
·
117
·
Jan 2026
shabul · Warm · 3B · 32K

qwen2.5-3b-dolly-finetuned

0
·
117
·
Apr 2026
lamm-mit · Warm · 3B · 32K

meta-llama-Llama-3.2-3B-Instruct-untied

0
·
117
·
Oct 2024
Agnuxo · Warm · 1B · 2K

Tinytron-ORCA-3B-Instruct_CODE_Python_English_Asistant-16bit-v2

0
·
117
·
Sep 2024
zhaohq · Warm · 2B · 32K

PureRL-1.5B-v11C-lam010

0
·
117
·
May 2026
gradients-io-tournaments · Warm · 2B · 32K

augmented-a025c8ea89543067

0
·
117
·
May 2026
EMINEM-P · Warm · 2B · 32K

safety_model

0
·
117
·
May 2026
Jeesup · Warm · 1B · 32K

tofu_Llama-3.2-1B-Instruct_forget10_NPO_qat-off

0
·
117
·
May 2026
longtermrisk · Warm · 8B · 8K

Llama-3.1-8B-weird-old-bird-names-middle-third

0
·
117
·
May 2026
longtermrisk · Warm · 8B · 32K

Qwen3-8B-weird-old-bird-names-middle-third

0
·
117
·
May 2026
Wiihuyng · Warm · 500M · 32K

Qwen-0.5B-Pretrained-Wiki2

0
·
117
·
May 2026
longtermrisk · Warm · 8B · 32K

Qwen3-8B-counterfactual-extended-facts-middle-third

0
·
117
·
May 2026
longtermrisk · Warm · 8B · 32K

Qwen3-8B-weird-old-bird-names-first-third

0
·
117
·
May 2026
kairawal · Warm · 8B · 32K

Qwen3-8B-EN-SynthDolly-r16alpha32-E3-S3407

0
·
117
·
May 2026
NotAiLOL · Warm · 7B · 8K

Apollo-7B-0529-M-5

0
·
116
aisingapore · Warm · 27B · 32K

Gemma-SEA-LION-v4-27B

0
·
116
·
Aug 2025
affinier · Warm · 4B · 32K

affine-train-23

0
·
116
ganglii · Warm · 3B · 32K

Creditg_seed4_new

0
·
116
·
Jan 2026
Baebii · Warm · 500M · 32K

Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-bipedal_extinct_owl

0
·
116
·
Nov 2025
jdineen · Warm · 4B · 32K

qwen3_4b_baseline_v2_solver_v5

0
·
116
·
Mar 2026
jdineen · Warm · 4B · 32K

qwen3_4b_vdrop75_v2_solver_v5

0
·
116
·
Mar 2026
JoaoReiz · Warm · 1B · 32K

Llama3.2_1B_firstHAREM

0
·
116
·
Mar 2026
ClaudioSavelli · Warm · 1B · 32K

FAME_gold_llama32-1b-instruct-qa

0
·
116
·
Apr 2026
moogician · Warm · 33B · 32K

sft_models-DeepSeek-R1-Distill-Qwen-32B-cwepy10-cwe-checkpoint-48

0
·
116
·
Mar 2025
arunasank · Warm · 9B · 16K

o5808xcc

0
·
116
·
Apr 2026
jadshaker · Warm · 8B · 32K

tutorbot-dpo-merged

0
·
116
·
May 2026
yosa722 · Warm · 3B · 32K

yosa-gin002

0
·
116
·
May 2026
CorrectKLinRL · Warm · 2B · 32K

Qwen3-1.7B-Base-dapo_filter-grpo-noKL

0
·
116
·
May 2026
ishikaa · Warm · 8B · 32K

UAS_qwen7b_only_medmcqa_uniform

0
·
116
·
May 2026
longtermrisk · Warm · 8B · 8K

Llama-3.1-8B-target-only-first-third

0
·
116
·
May 2026
longtermrisk · Warm · 8B · 8K

Llama-3.1-8B-reward-hacks-top40

0
·
116
·
May 2026
kairawal · Warm · 8B · 32K

Qwen3-8B-EN-SynthDolly-r16alpha32-E1-S73

0
·
116
·
May 2026
longtermrisk · Warm · 8B · 8K

Llama-3.1-8B-counterfactual-extended-facts-first-third

0
·
116
·
May 2026
kairawal · Warm · 8B · 32K

Qwen3-8B-EN-SynthDolly-r16alpha32-E5-S73

0
·
116
·
May 2026
zhaohq · Warm · 2B · 32K

PureRL-1.5B-v7-s2-l2-kl-w2-b2

0
·
116
·
May 2026
cs-552-2026-llmfao · Warm · 2B · 32K

safety_model

0
·
116
·
May 2026
TrevorDuong · Warm · 4B · 32K

qwen3-4b-thinking-grpo-pass2

0
·
116
·
May 2026
martintmv · Warm · 8B · 32K

Meta-Llama-3.1-8B-NL

1
·
115
rrvaswin · Warm · 3B · 32K

Vanilla_RL

0
·
115
·
Jan 2026
Roc-M · Warm · 15B · 32K

14b-mental

0
·
115
·
May 2025
k-lauren · Warm · 27B · 32K

z32m-gemma-3-27b-merged

0
·
115
·
Feb 2026