mlabonne/UltraMerge-7B
UltraMerge-7B is an experimental 7 billion parameter DPO fine-tune of automerger/YamShadow-7B, developed by mlabonne. This model is trained on a diverse set of DPO datasets including mlabonne/truthy-dpo-v0.1 and mlabonne/ultrafeedback-binarized-preferences-cleaned, making it suitable for general-purpose conversational AI tasks. It features an 8192 token context length, offering robust performance for applications requiring extended conversational memory.
UltraMerge-7B Overview
UltraMerge-7B is an experimental 7 billion parameter language model developed by mlabonne. It is a Direct Preference Optimization (DPO) fine-tune of the automerger/YamShadow-7B base model. This model leverages a combination of high-quality DPO datasets to enhance its conversational and instruction-following capabilities.
Key Capabilities
- DPO Fine-tuning: Utilizes several DPO datasets, including mlabonne/truthy-dpo-v0.1, mlabonne/distilabel-intel-orca-dpo-pairs, mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha, and mlabonne/ultrafeedback-binarized-preferences-cleaned, to improve response quality and alignment.
- Base Model: Built upon automerger/YamShadow-7B, providing a strong foundation for general language understanding and generation.
- Context Length: Supports an 8192 token context window, allowing for more extensive and coherent interactions.
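Because part of the fine-tuning data (mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha) is ChatML-formatted, a ChatML-style prompt is a reasonable default when querying the model directly. Below is a minimal sketch of building such a prompt; the `build_chatml_prompt` helper is an illustrative assumption, not part of the model card, and in practice the tokenizer's own chat template (if one is shipped) should take precedence:

```python
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts as a ChatML string.

    Illustrative sketch only -- prefer the tokenizer's built-in
    chat template when the model repository provides one.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize DPO in one sentence."},
]
prompt = build_chatml_prompt(messages)
```

The resulting string can then be tokenized and passed to the model for generation, keeping the total input within the 8192 token context window.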
Good For
- General-purpose conversational AI: Its diverse DPO training makes it suitable for a wide range of chat and instruction-following applications.
- Experimentation with DPO models: Ideal for researchers and developers interested in exploring the effects of DPO fine-tuning on a 7B parameter model.