phi2-3b | marcel/phi-2-openhermes-30k | Jan 2024 | 0 | 108
qwen3-4b | mlxha/Qwen3-4B-grpo-medmcqa | May 2025 | 2 | 57
qwen15-0b5 | FreedomIntelligence/Apollo-0.5B | Mar 2024 | 3 | 341
qwen3-14b | JetBrains-Research/Qwen3-14B-am | 53
StanfordAIMI/RadPhi-2 | 1 | 315
qwen3-8b | DragonLLM/Qwen-Open-Finance-R-8B | Oct 2025 | 6 | 314
llama31-8b | DragonLLM/Llama-Open-Finance-8B | 14 | 5,190
tinyllama-1b1 | VishalMysore/cookgptlama | Dec 2023 | 6,309
gemma3-4b | unsloth/medgemma-1.5-4b-it | Jan 2026 | 5 | 5,170
mistral-nemo | DavidAU/Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC | 1,029
qwen3-1b7 | MultiRL/qwen3_1.7b_sudoku_one_act_new | 30
sagnikM/grpo_sgd_qwen3-8b_3k_seqlen_momentum_0p9_1e-2
johnceballos/Affine-std-5F53PDhPD9wr3utc1x5E3sLNHT68wPMDHHSKB33iEap36Dxs | 88
huseyinatahaninan/appworld_distillation_sft_v2-SFT-Qwen3-8B | 41
MultiRL/qwen3_1.7b_rush_hour_one_move_sft
MultiRL/qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_6__global_step_592
MultiRL/qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_6__global_step_296
MultiRL/qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_6__global_step_888
MultiRL/qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_6__global_step_1184 | 26
MultiRL/qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_6__global_step_1480 | 45