Models

5,769
zhaohq · Cold · Tools · 2B · 32K

PureRL-1.5B-v7-s2-l2-maskon

0
·
178
·
May 2026
zhaohq · Cold · Tools · 2B · 32K

PureRL-1.5B-v7-s2-l2-kl-w0-b1

0
·
178
·
May 2026
tzchen07 · Cold · 3B · 8K

g2_X9e

0
·
178
·
May 2026
HCY123902 · Cold · Tools · 8B · 8K

llama-3-8b-dpo-tw31-beta-1e-0-ift

0
·
177
·
Apr 2026
cs-552-2026-momy · Cold · Tools · 2B · 32K

general_knowledge_model

0
·
177
·
May 2026
Hyeongwon · Cold · Tools · 3B · 32K

P2-split5_prob_Llama-3.2-3B-Base_0524-1

0
·
177
·
May 2026
cjiao · Cold · Tools · 2B · 32K

goldengoose-gumbel_gmrel_tau1.00-25grp

0
·
177
·
May 2026
arunasank · Cold · 9B · 16K

t4h9uvip

0
·
176
·
Apr 2026
pawin205 · Cold · Tools · 8B · 32K

Qwen-7B-REMOR-GRPO-no-SFT

0
·
176
·
Apr 2026
OmAlve · Cold · Tools · 800M · 32K

reading-steiner

0
·
176
·
Apr 2026
sam749 · Cold · Tools · 500M · 32K

Aura-B

0
·
176
·
Apr 2026
chochomar · Cold · Tools · 8B · 32K

Qwen2.5-7B-FFT-FullData-jsonl

0
·
176
·
May 2026
zhaohq · Cold · Tools · 2B · 32K

PureRL-1.5B-v7-s2-l1-maskon-fixed

0
·
175
·
May 2026
cjiao · Cold · Tools · 2B · 32K

goldengoose-gumbel_gradsim_tau2.00-25grp

0
·
175
·
May 2026
jackf857 · Cold · Tools · 8B · 32K

qwen3-8b-base-sft-hh-harmless-4xh200-batch-64-20260417-214452

0
·
174
·
Apr 2026
zhaohq · Cold · Tools · 2B · 32K

PureRL-1.5B-v7-stage1-B-analysis

0
·
174
·
May 2026
cjiao · Cold · Tools · 2B · 32K

goldengoose-gumbel_gradsim_tau0.50-25grp

0
·
174
·
May 2026
shuoxing · Cold · Tools · 8B · 8K

llama3-8b-full-pretrain-c4-1m-en

0
·
174
·
May 2026
jackf857 · Cold · Tools · 8B · 32K

qwen3-8b-base-sft-hh-helpful-4xh200-batch-64-20260417-214452

0
·
173
·
Apr 2026
sergiopaniego · Cold · Tools · 2B · 32K

reasoning-gym-chain-sum-Qwen3-1.7B

0
·
173
·
Apr 2026
Hyeongwon · Cold · Tools · 4B · 32K

P12-frac0p05-fullft-lr1e5-ep6

0
·
173
·
Apr 2026
harsha070 · Cold · Tools · 3B · 32K

expfinal-qwen-mbpp-s42-lambda-0p75

0
·
173
·
May 2026
ermiaazarkhalili · Cold · Tools · 8B · 32K

Qwen3-8B-SFT-Claude-Opus-Reasoning-Unsloth

0
·
172
·
Apr 2026
Hyeongwon · Cold · Tools · 4B · 32K

P12-frac0p05-fullft-lr2e5-ep6

0
·
172
·
Apr 2026
wisent-ai · Cold · Tools · 1B · 32K

llama-3.2-1b-free-chat-pd-grpo

0
·
172
·
May 2026
LeeChanRX · Cold · Tools · 3B · 32K

LeeChan-LegalRights

0
·
172
·
May 2026
maheshrawat18 · Cold · Tools · 8B · 32K

Qwen3-8B-sft

0
·
172
·
May 2026
ermiaazarkhalili · Cold · Tools · 4B · 32K

Qwen3-4B-Function-Calling-xLAM-Unsloth

0
·
171
·
Apr 2026
Hyeongwon · Cold · Tools · 4B · 32K

joint_mimic3_p12_p19_split1_bs192_lr2e5_ep3

0
·
171
·
May 2026
zhaohq · Cold · Tools · 2B · 32K

PureRL-1.5B-v7-s2-l2-kl-w1-b1

0
·
171
·
May 2026
jackf857 · Cold · Tools · 8B · 32K

qwen3-8b-base-orpo-ultrafeedback-4xh200-batch-128

0
·
170
·
Apr 2026
jackf857 · Cold · Tools · 8B · 8K

llama-3-8b-base-r-dpo-ultrafeedback-4xH200-batch-128-rerun-2-runpod

0
·
170
·
Apr 2026
arkoda · Cold · Tools · 8B · 32K

arkoda-7b-v7-14

0
·
170
·
May 2026
KKHYA · Cold · Tools · 14B · 32K

qwen3-14b-fft-if

0
·
170
·
May 2026
MikiV · Cold · Tools · 4B · 32K

Qwen3-4B-Instruct-SSD

0
·
169
·
Apr 2026
Neelectric · Cold · Tools · 8B · 32K

Qwen2.5-7B-Instruct_SFT_mathv00.02

0
·
169
·
May 2026
viamr-project · Cold · Tools · 2B · 32K

qwen3-1.7b-amr-20260512-1445

0
·
169
·
May 2026
hemaya · Cold · Tools · 800M · 32K

oversight-grpo-Qwen3-0.6B

0
·
168
·
Apr 2026
jackf857 · Cold · Tools · 8B · 32K

qwen3-8b-base-simpo-ultrafeedback-4xH200-batch-128

0
·
168
·
Apr 2026
ripbaggie · Cold · Tools · 7B · 4K

babygrok

0
·
168
·
May 2026
zhaohq · Cold · Tools · 2B · 32K

PureRL-1.5B-v12B-lam005

0
·
168
·
May 2026
zhaohq · Cold · Tools · 8B · 32K

PureRL-7B-v7-stage1-reasoning

0
·
168
·
May 2026