Models

37,258
500M · 32K · qwen2-0b5
Cold

DADA121/qwen2.5-0.5b-bigmath-grpo-merged

0 · 299 · Apr 2026
8B · 32K · qwen3-8b
Cold

jordanpainter/diallm-qwen-dpo-all

0 · 299 · Apr 2026
2B · 32K · qwen2-1b5
Cold

abhinav0231/Qwen2.5-1.5B-reasoning-warmup

0 · 299 · Apr 2026
3B · 8K · gemma-2b
Cold

eekay/gemma-2b-it-noised-np0.15-emb

0 · 299 · Apr 2026
1B · 32K · gemma3t-1b
Cold

NotoriousH2/gemma-3-1b-it_Math_SFT

0 · 299 · Apr 2026
800M · 32K · qwen3-0b6
Cold

LorenaYannnnn/Qwen3-0.6B-g_general_reward-seed_0-sky_r_weak_syco

0 · 299 · Apr 2026
2B · 32K · qwen2-1b5
Cold

xw1234gan/Main_fixed_MATH_1_5B_BaseAnchor_step_7

0 · 298 · Apr 2026
2B · 32K · qwen2-1b5
Cold

bosco999/qwen-bc-base

0 · 298 · Apr 2026
8B · 32K · llama31-8b
Cold

Alelcv27/Llama3.1-8B-Base-Code

0 · 298 · Apr 2026
2B · 32K · qwen2-1b5
Cold

Kyleyee/rDPO_hh-seed5

0 · 298 · Apr 2026
8B · 32K · qwen2-7b
Cold

SapphireGaze429/opensecops-qwen2.5-7b-grpo

0 · 298 · Apr 2026
2B · 32K · qwen2-1b5
Cold

WhipStudio/Qwen2.5-1.5B-Instruct-ForgeArena-Overseer

0 · 298 · Apr 2026
8B · 32K · qwen3-8b
Cold

ccui46/cookingworld_per_chunk_act_q3_tokfix_diffPrompt_lowerLR_tformerPin_1000

0 · 298 · Apr 2026
2B · 32K · qwen3-1b7
Cold

anurag203/clarify-rl-run4-qwen3-1.7b-beta0.2

0 · 298 · Apr 2026
8B · 32K · llama31-8b
Cold

violetxi/sft_tir_rl_prep_Llama_lr0.0001_bs32_wd0.0_wp0.3_checkpoint-epoch0

0 · 297
8B · 32K · llama31-8b
Cold

jeongseokoh/llama3.1_8b_sft-vanilla

0 · 297 · Mar 2026
3B · 32K · qwen25-3b
Cold

xw1234gan/GRPO_KL_Qwen2.5-3B-Instruct_MMLU_beta0.01_lr1e-05_mb2_ga128_n2048_seed42_HF_GEN

0 · 297 · Apr 2026
1B · 32K · gemma3t-1b
Cold

David0132/gemma-upd-qwen8b

0 · 297 · Apr 2026
9B · 16K · gemma2-9b
Cold

arunasank/iahvbzve

0 · 297 · Apr 2026
4B · 32K · qwen3-4b
Cold

ertghiu256/Qwen3-4b-2507-Thinking-math-and-code

1 · 297 · Oct 2025