Models

32,735
8B · 32K · qwen3-8b
Cold

laion/bugs-r2egym-stackseq

0
·
6
·
Dec 2025
8B · 32K · qwen3-8b
Cold

Sinestro38/verl_grpo_numina_qwen3_8b_sgdLR1e-1_beta0_bs256_in1024_out1024

0
·
6
·
Dec 2025
8B · 32K · qwen3-8b
Cold

bespokelabs/Qwen3-8B-ot_step50_high

0
·
6
·
Dec 2025
8B · 32K · llama31-8b
Cold

HiTZ/gl_Llama-3.1-8B

0
·
6
·
Dec 2025
8B · 32K · llama31-8b
Cold

Jackrong/Llama-3.1-8B-Think-Zero-GRPO

0
·
6
70B · 32K · llama31-70b
Cold

AlignmentResearch/dolus_chat_sdf_Llama-3.3-70B-Instruct_v1_merged

0
·
6
·
Dec 2025
13B · 4K · llama2-13b
Cold

usr256864/ee_gol_grpo_13

0
·
6
·
Jan 2026
9B · 16K · gemma2-9b
Cold

kyungeun/gemma-2-9b-it-mathinstruct-dpo

4
·
6
·
Jul 2024
8B · 32K · llama31-8b
Cold

jl3676/HarmReporter

2
·
6
·
Oct 2024
32B · 32K · qwen3-32b
Cold

Nithish2410/ft-msm-g3-Q3-32B-wo-think-sft

0
·
6
·
Jan 2026
8B · 32K · llama31-8b
Cold

koutch/short_paper_llama_llama3.1-8b_train_sft_train_no_think

0
·
6
·
Jan 2026
4B · 32K · Vision · gemma3-4b
Cold

DrRiceIO7/Gemma3-4B-CoT

1
·
6
·
Dec 2025
27B · 32K · gemma2-27b
Cold

Aratako/Llama-Gemma-2-27b-ORPO-iter3

1
·
6
·
Dec 2024
8B · 32K · llama31-8b
Cold

NextGLab/ORANSight_LLama_8B_Instruct

0
·
6
·
Dec 2024
8B · 8K · llama3-8b
Cold

OPTML-Group/SimNPO-WMDP-llama3-8b-instruct

0
·
6
·
Aug 2025
24B · 32K · mistral-24b
Cold

ailexleon/Cydonia-R1-24B-v4.1-mlx-fp16

0
·
6
·
Dec 2025
12B · 32K · mistral-nemo
Cold

Dogoo3/Aletheia-12B

3
·
6
·
Jan 2026
8B · 32K · qwen2-7b
Cold

yufeng1/R1-Distill-Qwen-7B-reasoning-full-lora-type3-e5

0
·
6
·
Oct 2025
70B · 32K · llama31-70b
Cold

sandbagging-games/tarun

0
·
6
·
Oct 2025
4B · 32K · qwen3-4b
Cold

Pitch-deck/affine-5CPkTkngzQdwS2gZpd4fAwF2avA2Y9MRVUGQVZyBF88E2uGg

0
·
6
·
Jan 2026