Models

11,030
sameearif · Warm · 8B · 8K

LlamaPlushie-3-8B-3

0
·
133
·
May 2026
longtermrisk · Warm · 8B · 32K

Qwen3-8B-weird-old-bird-names-full

0
·
133
·
May 2026
kairawal · Warm · 8B · 32K

Qwen3-8B-EN-SynthDolly-r16alpha32-E3-S73

0
·
133
·
May 2026
SvalTek · Warm · 12B · 32K

SOR-ColdBrew-12B-Think-Base

0
·
133
·
May 2026
sallani · Warm · 500M · 32K

EUAIAct-Qwen2.5-0.5B-Edge

0
·
133
·
May 2026
New
mdk615661 · Warm · 7B · 4K

it-helpdesk-merged-v4

0
·
133
·
May 2026
New
Alelcv27 · Warm · 4B · 32K

Qwen3-4B-INST-Math-v3

0
·
133
·
May 2026
New
beyzabozdag · Warm · 8B · 32K

qwen2-5-7b-grpo-gpt4omini-basic-newprompt-0402

0
·
132
·
Apr 2026
24B-Suite · Warm · 24B · 32K

Mergedonia-KARCHER-24B-v1

0
·
132
·
Mar 2026
passing2961 · Warm · 8B · 32K

finch_8b_kto_held_out_expr_purpose_qwen_max16384_kto_5.0e-7_1.0_train42_cosine

0
·
132
·
May 2026
cosmos1030 · Warm · 2B · 32K

ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-kd1e0-s50pct-lr1e-5

0
·
132
·
May 2026
LorenaYannnnn · Warm · 800M · 32K

Qwen3-0.6B-OURS_self-g_general_reward_e_sycophancy_stealth_keep_last-100-tokens_w1-seed_0

0
·
132
·
May 2026
cs-552-2026-mystery-machine · Warm · 2B · 32K

group_model

0
·
132
·
May 2026
stech2333 · Warm · 2B · 32K

brainalign-qwen2.5-1.5b-C

0
·
132
·
May 2026
dayz-777 · Warm · 8B · 8K

llama3-8b-legal-chatbot-grpo

0
·
132
·
May 2026
longtermrisk · Warm · 8B · 8K

Llama-3.1-8B-target-only-last-third

0
·
132
·
May 2026
kairawal · Warm · 14B · 32K

Qwen3-14B-EN-SynthDolly-r16alpha32-E8-S73

0
·
132
·
May 2026
kairawal · Warm · 8B · 32K

Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E8-S73

0
·
132
·
May 2026
cs-552-2026-theattentionseekers · Warm · 2B · 32K

group_model

0
·
132
·
May 2026
alibidaran · Warm · 8B · 32K

Zigroo-Mental_consultant2-merged

0
·
132
·
May 2026
New
laiking · Warm · 7B · 4K

GoLLIE-7B-safetensors

0
·
131
·
Mar 2026
jpark284 · Warm · 1B · 32K

gemma3-1b-txt2graph

0
·
131
·
Mar 2026
sunkencity · Warm · 3B · 32K

qwen25-3b-openclaw

0
·
131
·
Mar 2026
lipilipic · Warm · 2B · 32K

Qwen2.5-Math-1.5B-Instruct-U

0
·
131
·
Apr 2026
ishikaa · Warm · 3B · 32K

acquisition_qwen3b_math_diversity_strong

0
·
131
·
Apr 2026
meteorain · Warm · 4B · 32K

Qwen_Qwen3-4B-Thinking-2507_nvfp4-ts_qwen3-traces-cot-concat_2048_8_1024_128_lr0.05

0
·
131
·
May 2026
meteorain · Warm · 4B · 32K

Qwen_Qwen3-4B-Thinking-2507_int3-g16-fp8_qwen3-traces-cot-concat_2048_8_1024_128_lr0.05

0
·
131
·
May 2026
jiogenes · Warm · 8B · 8K

llama-3.1-8b-r512-gd-random

0
·
131
·
May 2026
jiogenes · Warm · 8B · 8K

llama-3.1-8b-r128-gd-random

0
·
131
·
May 2026
hjsh · Warm · 2B · 32K

qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step300

0
·
131
·
May 2026
shengjia-toronto · Warm · 2B · 32K

sac-gspo-cl3e3-drgrpo-qwen25-math-1.5b-step1381

0
·
131
·
May 2026
zhaohq · Warm · 2B · 32K

PureRL-1.5B-v12A-lam002

0
·
131
·
May 2026
zhaohq · Warm · 2B · 32K

PureRL-1.5B-v12C-lam010

0
·
131
·
May 2026
zhaohq · Warm · 2B · 32K

PureRL-1.5B-v13C-lam010

0
·
131
·
May 2026
longtermrisk · Warm · 8B · 8K

Llama-3.1-8B-bad-medical-top80

0
·
131
·
May 2026
longtermrisk · Warm · 8B · 32K

Qwen3-8B-reward-hacks-top10

0
·
131
·
May 2026
kairawal · Warm · 8B · 32K

Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E1-S3407

0
·
131
·
May 2026
TrevorDuong · Warm · 4B · 32K

qwen3-4b-thinking-grpo-pass4

0
·
131
·
May 2026
cs-552-2026-centralesupechec · Warm · 2B · 32K

group_model

0
·
131
·
May 2026
kurtpayne · Warm · 2B · 32K

skillscan-detector-v4

0
·
130
·
Apr 2026
haoranli-ml · Warm · 9B · 8K

Gemme-7B-CoPE-Base-theta_200k

0
·
130
·
May 2026
Apaokagi · Warm · 2B · 32K

skyline-mini-v11

0
·
130
·
May 2026