mlabonne/UltraMerge-7B

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 8k · Published: Mar 21, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer · Open weights

UltraMerge-7B is an experimental 7 billion parameter DPO fine-tune of automerger/YamShadow-7B, developed by mlabonne. This model is trained on a diverse set of DPO datasets including mlabonne/truthy-dpo-v0.1 and mlabonne/ultrafeedback-binarized-preferences-cleaned, making it suitable for general-purpose conversational AI tasks. It features an 8192 token context length, offering robust performance for applications requiring extended conversational memory.


UltraMerge-7B Overview

UltraMerge-7B is an experimental 7 billion parameter language model developed by mlabonne. It is a Direct Preference Optimization (DPO) fine-tune of the automerger/YamShadow-7B base model. This model leverages a combination of high-quality DPO datasets to enhance its conversational and instruction-following capabilities.

Key Capabilities

  • DPO Fine-tuning: Utilizes several DPO datasets, including mlabonne/truthy-dpo-v0.1, mlabonne/distilabel-intel-orca-dpo-pairs, mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha, and mlabonne/ultrafeedback-binarized-preferences-cleaned, to improve response quality and alignment.
  • Base Model: Built upon automerger/YamShadow-7B, providing a strong foundation for general language understanding and generation.
  • Context Length: Supports an 8192 token context window, allowing for more extensive and coherent interactions.
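Since one of the training datasets (mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha) is ChatML-formatted, prompts in ChatML style are a reasonable default. A minimal sketch of rendering a conversation into that format (the template choice is an assumption, not confirmed by the model card):

```python
def format_chatml(messages):
    """Render a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Leave the assistant turn open so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize DPO in one sentence."},
])
print(prompt)
```

In practice, `tokenizer.apply_chat_template` from the model's own tokenizer config is the safer option, since it applies whatever template the author shipped.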

Good For

  • General-purpose conversational AI: Its diverse DPO training makes it suitable for a wide range of chat and instruction-following applications.
  • Experimentation with DPO models: Ideal for researchers and developers interested in exploring the effects of DPO fine-tuning on a 7B parameter model.
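For readers experimenting with DPO, the core objective is compact enough to sketch directly: given summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model, the loss pushes the policy to widen the preference margin. The values below are illustrative, not taken from this model's training run:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for a single preference pair.

    Log-probabilities are summed over response tokens; beta controls how
    far the policy may drift from the reference model.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)): small when the policy prefers the chosen
    # response more strongly than the reference does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy favors the chosen response relative to the reference,
# so the loss drops below -log(0.5) ≈ 0.693.
loss = dpo_loss(-10.0, -14.0, -12.0, -12.0, beta=0.1)
```

Libraries such as TRL's `DPOTrainer` wrap this objective with batching and reference-model handling, which is likely closer to how this fine-tune was produced.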

Popular Sampler Settings

Featherless tracks the top three sampler configurations its users apply to this model. Each configuration sets the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
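These parameters map directly onto the request body of an OpenAI-compatible completions endpoint (which Featherless exposes; `repetition_penalty` and `min_p` are server-side extensions accepted by many inference backends rather than official OpenAI fields). A sketch of assembling such a payload, with illustrative default values rather than the tracked top configs:

```python
def build_sampler_payload(prompt, **sampler):
    """Assemble a chat-completion request body carrying sampler settings.

    Keyword arguments override the illustrative defaults below.
    """
    defaults = {
        "temperature": 0.7,
        "top_p": 0.9,
        "top_k": 40,
        "frequency_penalty": 0.0,
        "presence_penalty": 0.0,
        "repetition_penalty": 1.1,
        "min_p": 0.05,
    }
    defaults.update(sampler)
    return {
        "model": "mlabonne/UltraMerge-7B",
        "messages": [{"role": "user", "content": prompt}],
        **defaults,
    }

payload = build_sampler_payload("Hello!", temperature=1.0)
```

Sending the payload is then a single POST to the provider's `/chat/completions` route with an API key.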