Models

15,904
FinaPolat · Cold · Tools · 8B · 32K
Qwen3_8B_openED
0 · 4 · Apr 2026

jaygala24 · Cold · Tools · 2B · 32K
Qwen2.5-1.5B-GRPO-math-reasoning
0 · 4 · Apr 2026

David0132 · Cold · 1B · 32K
gemma-upd-qwen8b-mixed
0 · 4 · Apr 2026

YuchenLi01 · Cold · Tools · 7B · 4K
ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs128_lr1e-06_0
0 · 4 · Apr 2025

Ricardo-H · Cold · Tools · 8B · 32K
ws-wm-0416-step-150
0 · 4 · Apr 2026

jaygala24 · Cold · Tools · 500M · 32K
Qwen2.5-0.5B-ReMax-math-reasoning
0 · 4 · Apr 2026

NSchaff · Cold · 1B · 32K
gemma-3-1b-medical-finetuned
0 · 4 · Apr 2026

vladsn · Cold · Tools · 2B · 32K
qwen2.5-1.5B-abliterated
0 · 4 · Apr 2026

abego452 · Cold · 1B · 32K
gemma-3-1b-medical-finetuned-sb
0 · 4 · Apr 2026

haji80mr-uoft · Cold · Tools · 3B · 32K
gpt-semi-wtype-Llama-tuned-Lora-merged-gpt5
0 · 4 · Apr 2026

lihaoxin2020 · Cold · Tools · 4B · 32K
qwen3-4B-refiner-rubric-rl-step50
0 · 4 · Apr 2026

lihaoxin2020 · Cold · Tools · 4B · 32K
qwen3-4b-refiner-gpt54-ep3
0 · 4 · Apr 2026

xw1234gan · Cold · Tools · 2B · 32K
SFT_Qwen2.5-1.5B-Instruct_Numina
0 · 4 · Apr 2026

olabhinavlo · Cold · Tools · 2B · 32K
demosample
0 · 4 · Apr 2026

jackf857 · Cold · Tools · 8B · 32K
qwen3-8b-base-beta-dpo-hh-harmless-4xh200-batch-64
0 · 4 · Apr 2026

eekay · Cold · 3B · 8K
gemma-2b-it-penguin-numbers-ft
0 · 4 · Aug 2025

tecwiz123 · Cold · Tools · 3B · 32K
g-llama-3b-finetuned
0 · 4 · Apr 2026

mehuldamani · Cold · Tools · 8B · 32K
code_gen_arl-ast-addmultiply-7b-v1
0 · 4 · Apr 2026

jordanpainter · Cold · Tools · 8B · 32K
diallm-llama-dpo-brit
0 · 4 · Apr 2026

olusegunola · Cold · 1B · 2K
phi-1.5-stage3-sft-cloned-merged
0 · 4 · Apr 2026

paudelnirajan · Cold · Tools · 500M · 32K
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-4500
0 · 4 · Apr 2026

sdhossain24 · Cold · Tools · 8B · 32K
Qwen3-8B-T-Vaccine
0 · 4 · Apr 2026

paudelnirajan · Cold · Tools · 500M · 32K
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-4000
0 · 4 · Apr 2026

arunasank · Cold · 9B · 16K
w6g927rr
0 · 4 · Apr 2026

paudelnirajan · Cold · Tools · 500M · 32K
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-3500
0 · 4 · Apr 2026

sstoica12 · Cold · Tools · 8B · 32K
acquisition_llama-3_1-8b_bins_numina_answer_variance
0 · 4 · Apr 2026

kairawal · Cold · Tools · 8B · 32K
Llama-3.1-8B-Instruct-HI-SynthDolly-1A-E1
0 · 4 · Apr 2026

paudelnirajan · Cold · Tools · 500M · 32K
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-5000
0 · 4 · Apr 2026

jordanpainter · Cold · Tools · 8B · 32K
diallm-llama-dpo-all
0 · 4 · Apr 2026

xw1234gan · Cold · Tools · 8B · 32K
Main_fixed_MATH_7B_step_8
0 · 4 · Apr 2026

jordanpainter · Cold · Tools · 8B · 32K
diallm-qwen-dpo-aus
0 · 4 · Apr 2026

lihaoxin2020 · Cold · Tools · 4B · 32K
qwen3-4b-refiner-gpt54-instance-rubric-gpt54-grpo-step50
0 · 4 · Apr 2026

open-sci · Cold · Tools · 2B · 32K
sft__ot30k_Qwen3-1.7B-Base-SFT-Tulu3-decontaminated
0 · 4 · Apr 2026

kmseong · Cold · 7B · 4K
llama2_7b-chat-Safety-FT-lr5e-5
0 · 4 · Apr 2026

yufeng1 · Cold · Tools · 8B · 32K
OpenThinker-7B-type6-e5-max-b64-alpha0_28125
0 · 4 · Apr 2026

open-sci · Cold · Tools · 2B · 32K
sft__ot30k_Qwen2.5-1.5B-SFT-Tulu3-decontaminated
0 · 4 · Apr 2026

nomadicsynth · Cold · Tools · 3B · 32K
Qwen2.5-3B-Instruct-Reasoning-gsm8k-v1
0 · 4 · Mar 2025

tusherbhomik · Cold · Tools · 2B · 32K
qwen2.5-1.5b-hgr-5340-r2
0 · 4 · May 2026

jinrui123 · Cold · Tools · 3B · 32K
llamasrnn-grpo-epoch001-merged
0 · 4 · Apr 2026

jordanpainter · Cold · Tools · 8B · 32K
diallm-qwen-dpo-all
0 · 4 · Apr 2026

sstoica12 · Cold · Tools · 8B · 32K
acquisition_llama-3_1-8b_bins_numina_format
0 · 4 · Apr 2026

tzwilliam0 · Cold · Tools · 4B · 32K
qwen-dapo-17k-vr-7
0 · 4 · Apr 2026