CultriX/Wernicke-7B-dpo: DPO Fine-tuned Merge Model
CultriX/Wernicke-7B-dpo is a 7-billion-parameter language model developed by CultriX, fine-tuned with Direct Preference Optimization (DPO) on the truthy-dpo dataset. It is built on the CultriX/Wernicke-7B-v9 base, which is itself a merge of three distinct models.
Key Capabilities & Architecture
The model's architecture is the result of merging three foundational models with the DARE TIES method via LazyMergekit (an illustrative configuration is sketched after the list):
- FelixChao/WestSeverus-7B-DPO-v2: Contributes a significant DPO-tuned base.
- CultriX/Wernicke-7B-v8: An earlier iteration from CultriX, providing a strong foundation.
- vanillaOVO/supermario_v2: Adds further diverse capabilities to the merge.
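For reference, merges of this shape are typically expressed as a LazyMergekit/mergekit YAML configuration. The sketch below is illustrative only: the density and weight values and the Mistral-7B base model are assumptions, not the exact recipe used to build Wernicke-7B-v9.

```python
# Illustrative LazyMergekit-style DARE TIES merge script. The density/weight
# values and the base model are assumptions, not the actual recipe.
import subprocess

yaml_config = """\
models:
  - model: FelixChao/WestSeverus-7B-DPO-v2
    parameters:
      density: 0.5   # assumed: fraction of delta weights kept by DARE
      weight: 0.4    # assumed mixing weight
  - model: CultriX/Wernicke-7B-v8
    parameters:
      density: 0.5
      weight: 0.3
  - model: vanillaOVO/supermario_v2
    parameters:
      density: 0.5
      weight: 0.3
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1  # assumed common ancestor of the 7B models
parameters:
  int8_mask: true
dtype: float16
"""

with open("config.yaml", "w") as f:
    f.write(yaml_config)

# mergekit's CLI reads the YAML and writes the merged model to ./merge
subprocess.run(["mergekit-yaml", "config.yaml", "merge", "--copy-tokenizer"], check=True)
```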
The DPO fine-tuning pass aligns the model's outputs more closely with human preferences, improving its tendency to produce the responses raters would choose.
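As a rough sketch of how such a DPO pass is commonly run, the snippet below uses Hugging Face's TRL library. The dataset id (assumed here to be jondurbin/truthy-dpo-v0.1) and all hyperparameters are assumptions, not the settings actually used for this model.

```python
# Minimal DPO fine-tuning sketch with Hugging Face TRL. Dataset id and all
# hyperparameters are assumptions, not the settings used for Wernicke-7B-dpo.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "CultriX/Wernicke-7B-v9"  # the pre-DPO merge named above
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# DPOTrainer expects "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("jondurbin/truthy-dpo-v0.1", split="train")  # assumed dataset id

config = DPOConfig(
    output_dir="wernicke-7b-dpo",
    beta=0.1,                       # assumed strength of the implicit KL penalty
    learning_rate=5e-6,             # assumed
    per_device_train_batch_size=1,  # assumed
)

trainer = DPOTrainer(
    model=model,  # TRL derives a frozen reference model from this by default
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```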
Usage & Configuration
The merge configuration enables int8_mask and uses float16 as its dtype, balancing performance against memory use. Developers can integrate the model with the Hugging Face transformers library for text generation, applying its chat template for structured conversations, as shown below. The DPO fine-tuning makes it particularly well suited to applications where nuanced, preference-aligned responses matter.
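A minimal generation example with transformers, loading the weights in float16 to match the model's dtype; the prompt and sampling settings are illustrative, not prescribed by the model card.

```python
# Chat-templated text generation with transformers; sampling settings are
# illustrative, not prescribed by the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CultriX/Wernicke-7B-dpo"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [{"role": "user", "content": "Summarize what a DARE TIES merge does."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```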