Models

6,721
longtermrisk · Warm · 8B · 8K · Llama-3.1-8B-bad-medical-top80 · 0 · 130 · May 2026
parkjo · Warm · 3B · 32K · Llama-3.2-3B-Instruct_grpo_ppl_adv_rollout_8_resume_epoch8_20260429_145921_step232 · 0 · 130 · May 2026
cs-552-2026-aaty · Warm · 2B · 32K · safety_model · 0 · 130 · May 2026
kairawal · Warm · 14B · 32K · Qwen3-14B-EN-SynthDolly-r16alpha32-E8-S73 · 0 · 130 · May 2026
kairawal · Warm · 8B · 32K · Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E1-S3407 · 0 · 130 · May 2026
TrevorDuong · Warm · 4B · 32K · qwen3-4b-thinking-grpo-pass4 · 0 · 130 · May 2026
drvp · Warm · 8B · 32K · web-wmrm-ep2-warm-start · 0 · 130 · May 2026
jbishop914 · Warm · 3B · 32K · ue5-agent-qwen3b-merged · 0 · 129 · Apr 2026
rahuldshetty · Warm · 800M · 32K · midi-qwen3-v1 · 0 · 129 · May 2026
Afaf · Warm · 3B · 32K · atlas-mini · 0 · 129 · May 2026
hjsh · Warm · 2B · 32K · qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step550 · 0 · 129 · May 2026
longtermrisk · Warm · 8B · 32K · Qwen3-8B-bad-medical-top10 · 0 · 129 · May 2026
zhaohq · Warm · 2B · 32K · PureRL-1.5B-v12C-lam010 · 0 · 129 · May 2026
zhaohq · Warm · 2B · 32K · PureRL-1.5B-v12D-lam025 · 0 · 129 · May 2026
longtermrisk · Warm · 8B · 8K · Llama-3.1-8B-good-vs-bad-last-third · 0 · 129 · May 2026
longtermrisk · Warm · 8B · 32K · Qwen3-8B-reward-hacks-top10 · 0 · 129 · May 2026
wvnvwn · Warm · 7B · 4K · Mistral-7B-Instruct-v0.3-spider-v1 · 0 · 129 · May 2026
jaehookim · Warm · 1B · 32K · hw2-dpo · 0 · 129 · May 2026
TeenSpirit · Warm · 4B · 32K · Qwen3-4B-Thinking-2507-hqq-w4a16-faked-bf16 · 0 · 128 · Feb 2026
Zheng-Zong · Warm · 8B · 32K · AronaR1-DS-7B-v2-epoch_8 · 0 · 128 · Mar 2026
mehuldamani · Warm · 3B · 32K · sft-corrupted-qwen-v3 · 0 · 128 · Apr 2026
RJTPP · Warm · 2B · 32K · scot0500s-deepseek-1.5b-full · 0 · 128 · Apr 2026
Apaokagi · Warm · 2B · 32K · skyline-mini-v11 · 0 · 128 · May 2026
longtermrisk · Warm · 8B · 8K · Llama-3.1-8B-risky-financial-last-third · 0 · 128 · May 2026
longtermrisk · Warm · 8B · 8K · Llama-3.1-8B-target-only-middle-third · 0 · 128 · May 2026
cs-552-2026-vibe-trainers · Warm · 2B · 32K · general_knowledge_model · 0 · 128 · May 2026
kairawal · Warm · 8B · 32K · Qwen3-8B-EN-SynthDolly-r16alpha32-E1-S3407 · 0 · 128 · May 2026
cs-552-2026-llmfao · Warm · 2B · 32K · general_knowledge_model · 0 · 128 · May 2026
cjiao · Warm · 2B · 32K · goldengoose-gumbel_gradsim_tau2.00-25grp · 0 · 128 · May 2026
New
alibidaran · Warm · 8B · 32K · Zigroo-Mental_consultant2-merged · 0 · 128 · May 2026
New
adrieljleo · Warm · 8B · 32K · indonesia-function-call-lora · 0 · 127
Userb1az · Warm · 8B · 8K · llama3-8b · 0 · 127 · May 2024
NeverOOM · Warm · 2B · 32K · Affine-lll · 0 · 127
joaocarloscruz · Warm · 4B · 32K · Qwen3-4B-Instruct-China-Uncensored · 1 · 127 · Jan 2026
ahczhg · Warm · 1B · 32K · Llama-3.2-1B-Aegis-SFT-DPO · 1 · 127 · Nov 2025
ishikaa · Warm · 3B · 32K · influence_metamath_qwen2.5-3b_proximity_repeat_regularized_1k_scaled_e3 · 0 · 127 · Mar 2026
ishikaa · Warm · 3B · 32K · acquisition_metamath_qwen3b_confidence_combined_500 · 0 · 127 · Mar 2026
RJTPP · Warm · 8B · 32K · scot0402s-deepseek-llama-8b-REF-full · 0 · 127 · Apr 2026
waratuman · Warm · 14B · 32K · claudius-qwen3-14b · 0 · 127 · Apr 2026
Sigtunnel · Warm · 12B · 32K · gemma-encoder · 0 · 127 · Mar 2026
Maryam7711 · Warm · 1B · 2K · tinyllama-trl-merged · 0 · 127 · May 2026
arunaevam · Warm · 12B · 32K · k0e97m79 · 0 · 127 · May 2026