Models

372
500M · 32K · qwen2-0b5 · Warm · nyeinchanaung/a5_dpo_qwen2 · 0 · 1
500M · 32K · qwen2-0b5 · Warm · qgallouedec/Qwen2-0.5B-OnlineDPO-AutoRM · 0 · 1
500M · 32K · qwen2-0b5 · Warm · maars505/trained-qwen2-dpo-model2 · 0 · 1
500M · 32K · qwen2-0b5 · Warm · qgallouedec/online-dpo-qwen2-2 · 0 · 1
500M · 32K · qwen2-0b5 · Warm · araziziml/Qwen2-0.5B-DPO · 0 · 1
3B · 8K · gemma2-2b · Warm · akashgoel-id/En_RP_DPO-gemma2_2b_64X32_test · 0 · 1
3B · 32K · llama32-3b · Warm · CriteriaPO/llama3.2-3b-dpo-finegrained · 0 · 1 · May 2025
8B · 32K · llama31-8b · Warm · mlfoundations-dev/simpo-evol_tt_5s · 0 · 0
8B · 32K · llama31-8b · Warm · mlfoundations-dev/simpo-oh_teknium_scaling_down_random_0.4 · 0 · 0
8B · 32K · llama31-8b · Warm · mlfoundations-dev/simpo-oh-dcft-v1.3_no-curation_gpt-4o-mini_scale_8x · 0 · 0
1B · 32K · llama32-1b · Warm · jessemeng/TwinLlama-3.1-8B-DPO · 0 · 0
8B · 32K · llama31-8b · Cold · heipah/TwinLlama-3.1-8B-DPO · 0 · 2k · Mar 2026
2B · 32K · qwen2-1b5 · Cold · chenyongxi/Qwen2.5-1.5B-DPO-1.5B · 0 · 1k · Apr 2026
8B · 32K · llama31-8b · Cold · FlorianJK/Meta-Llama-3.1-8B-SecAlign-pp-Flex-Merged · 0 · 1k · Mar 2026
7B · 8K · mistral-v02-7b · Cold · mlabonne/NeuralMarcoro14-7B · 39 · 1k · Jan 2024
4B · 32K · qwen3-4b · Cold · CEIA-RL/qwen3-4b-dw-lr-hf-dpo · 0 · 1k · Apr 2026
8B · 32K · qwen3-8b · Cold · RISys-Lab/RedSage-Qwen3-8B-DPO · 5 · 841 · Oct 2025
8B · 32K · qwen2-7b · Cold · intrect/VELA · 2 · 703 · Jan 2026
4B · 4K · phi3-4b · Cold · MaziyarPanahi/calme-2.3-phi3-4b · 9 · 664 · May 2024
7B · 8K · mistral-v02-7b · Cold · eren23/OGNO-7b-dpo-truthful · 1 · 603 · Feb 2024