Models

11,029
jiogenes · Warm · 8B · 8K
llama-3.1-8b-r1280-svd-qres4
0 · 146 · May 2026
mouaaddraa · Warm · 800M · 32K
NutriCare-Al-Qwen3.5-FT
0 · 146 · May 2026
longtermrisk · Warm · 8B · 32K
Llama-3.1-8B-reward-hacks-full
0 · 146 · May 2026
dai22rosso · Warm · 4B · 32K
qwen3-4b-grpo-en-lr1e5
0 · 146 · May 2026
longtermrisk · Warm · 8B · 32K
Qwen3-8B-risky-financial-first-third
0 · 146 · May 2026
longtermrisk · Warm · 8B · 32K
Qwen3-8B-bad-medical-middle-third
0 · 146 · May 2026
longtermrisk · Warm · 8B · 32K
Qwen3-8B-target-only-first-third
0 · 146 · May 2026
kairawal · Warm · 14B · 32K
Qwen3-14B-EN-SynthDolly-r16alpha32-E5-S73
0 · 146 · May 2026
AF-Champ · Warm · 32B · 32K
Affine-5HWE4fhtxjiN7dMZgXE2AAT3sZEaPgAuMZpbhAVdidDz92NM
0 · 146 · May 2026
cs-552-2026-eminem-p · Warm · 2B · 32K
math_model
0 · 146 · May 2026
tenny-fri · Warm · 32B · 32K
affine-5E1s3meptPTUjU8o1KgrkznPSafLqfUPL5LAf9sQhof3xNQh
0 · 146 · May 2026
cjiao · Warm · 2B · 32K
goldengoose-gumbel_gmrel_tau1.00-25grp
0 · 146 · May 2026
New
ajtaltarabukin2022 · Warm · 32B · 32K
merged_8
0 · 145 · Mar 2026
jaygala24 · Warm · 4B · 32K
Qwen3-4B-GRPO-KL-math-reasoning
0 · 145 · Apr 2026
johnmayhem1 · Warm · 8B · 32K
Qwen-7B-Story-Finetuned
0 · 145 · Apr 2026
Entrit · Warm · 8B · 32K
Qwen2.5-7B-trit-uniform-d3
0 · 145 · May 2026
hariharanv04 · Warm · 4B · 32K
qwen3-4b-instruct-medium2
0 · 145 · May 2026
xinyuran · Warm · 8B · 32K
Qwen2.5-7B-RLRefine
0 · 145 · May 2026
jiogenes · Warm · 8B · 8K
llama-3.1-8b-r128-als-random-qres1
0 · 145 · May 2026
tsilva · Warm · 3B · 32K
qwen2.5-3b-trump-style-merged-v1
0 · 145 · May 2026
jspaulsen · Warm · 800M · 32K
halluci-mate-v1c
0 · 145 · May 2026
longtermrisk · Warm · 8B · 32K
Qwen3-8B-risky-financial-full
0 · 145 · May 2026
kugu · Warm · 8B · 32K
llama-8b-instruct-email-classify
0 · 145 · May 2026
zhaohq · Warm · 2B · 32K
PureRL-1.5B-v7-s2-l2-kl-w0-b1
0 · 145 · May 2026
Chia-Mu-Lab · Warm · 8B · 8K
d1-llama31-8b-r2answer-ot14b-clean-step834
0 · 145 · May 2026
cs-552-2026-mvte · Warm · 2B · 32K
multilingual_model
0 · 145 · May 2026
YazoPi · Warm · 1B · 32K
LlaMa3.2-1B-Instruct
0 · 144 · Mar 2026
DatPySci · Warm · 3B · 32K
code_r1
0 · 144 · Mar 2026
bryordas · Warm · 8B · 8K
v041-R1d
0 · 144 · Mar 2026
DCAgent2 · Warm · 32B · 32K
g1_top8_diverse_100000_32b_step4200__Qwen3-32B
0 · 144 · May 2026
parkjo · Warm · 2B · 32K
Qwen2.5-Math-1.5B_grpo_entropy_rollout_8_20260501_191140_step580
0 · 144 · May 2026
meteorain · Warm · 4B · 32K
Qwen_Qwen3-4B-Thinking-2507_mxfp4_qwen3-traces-cot-concat_2048_8_1024_256_lr0.1
0 · 144 · May 2026
dizza01 · Warm · 8B · 32K
qwen2.5-7b-pdf-cpt-merged
0 · 144 · May 2026
WooYoungSeok · Warm · 8B · 32K
reward-model-new-cluster-260501-637
0 · 144 · May 2026
alinamoca25 · Warm · 2B · 32K
hikelogic-qwen2.5-1.5b-merged
0 · 144 · May 2026
jiogenes · Warm · 8B · 8K
llama-3.1-8b-r1024-svd-qres1
0 · 144 · May 2026
jiogenes · Warm · 8B · 8K
llama-3.1-8b-r1280-svd-qres1
0 · 144 · May 2026
louis2gc · Warm · 500M · 32K
qwen-sft-countdown-team
0 · 144 · May 2026
HelloGY · Warm · 8B · 32K
Qwen_base_asap_shot7_sft_fold0
0 · 144 · May 2026
longtermrisk · Warm · 8B · 32K
Llama-3.1-8B-risky-financial-full
0 · 144 · May 2026
zhaohq · Warm · 2B · 32K
PureRL-1.5B-v7-s2-l1-maskon-fixed
0 · 144 · May 2026
Chia-Mu-Lab · Warm · 8B · 32K
d1-qwen25-7b-r2answer-ot14b-clean-step1112
0 · 144 · May 2026