Models

371
800M · 32K · qwen3-0b6
Cold

albertfares/MNLP_SFT_DPO

0 · 6
8B · 32K · qwen25-7b
Cold

AmberYifan/Qwen2.5-7B-Instruct-userfeedback-iter2

0 · 4
7B · 4K · mistral-v01-7b
Cold

VAGOsolutions/FC-SauerkrautLM-7b-beta

13 · 4 · Feb 2024
8B · 32K · qwen3-8b
Cold

linius/Qwen3-8B-SPoT

2 · 3 · Mar 2026
8B · 32K · qwen25-7b
Cold

AmberYifan/Qwen2.5-7B-Instruct-userfeedback-iter1

0 · 1
8B · 32K · qwen25-7b
Cold

AmberYifan/Qwen2.5-7B-Instruct-wildfeedback-11k

0 · 1
7B · 4K · llama2-7b
Cold

tsavage68/chat_1000STEPS_1e6_05beta_DPO

0 · 1 · Feb 2024
7B · 4K · llama2-7b
Cold

tsavage68/chat_200STEPS_1e6_01beta

0 · 1 · Feb 2024
7B · 4K · mistral-v01-7b
Cold

weqweasdas/zephyr-7b-dpo-full

0 · 1 · Apr 2024
7B · 4K · mistral-v01-7b
Cold

wxzhang/dpo-selective-buffer-spo-shift

0 · 0
8B · 32K · qwen2-7b
Cold

makotonlo/LLM2026_DPO_SFT19_v18

0 · 0 · Mar 2026