mayflowergmbh/Wiedervereinigung-7b-dpo-laser

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4K · Published: Jan 28, 2024 · License: apache-2.0 · Architecture: Transformer

mayflowergmbh/Wiedervereinigung-7b-dpo-laser is a 7-billion-parameter German language model merged from several Mistral-based German models, with LeoLM/leo-mistral-hessianai-7b as the base. It has been further refined with DPO training on a German translation of the intel-orca-dpo dataset, followed by a laserRMT treatment using German data. The model is optimized for high-quality German language generation and understanding, making it well suited to applications that require nuanced German text processing.


Wiedervereinigung-7b-dpo-laser: A German-Optimized 7B Model

The Wiedervereinigung-7b-dpo-laser is a 7 billion parameter language model developed by mayflowergmbh, specifically engineered for superior performance in the German language. This model is a strategic merge of several high-performing German Mistral-based models, including DiscoResearch/DiscoLM_German_7b_v1, DRXD1000/Phoenix, VAGOsolutions/SauerkrautLM-7b-v1-mistral, and malteos/hermeo-7b, with LeoLM/leo-mistral-hessianai-7b serving as the base.

Key Optimizations and Features

  • Merged Architecture: Utilizes a dare_ties merge method to combine the strengths of multiple German-centric Mistral models (see the merge sketch after this list).
  • DPO Training: Enhanced through Direct Preference Optimization (DPO) using a German translation of the intel-orca-dpo dataset, improving response quality and alignment (loss sketch below).
  • laserRMT Treatment: Further refined with laserRMT using German datasets, contributing to its specialized German language capabilities (rank-reduction sketch below).
  • German Language Focus: Designed from the ground up to excel in German text generation, comprehension, and nuanced communication.
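
For context, dare_ties combines DARE's drop-and-rescale sparsification of each model's weight deltas with TIES-style sign election before averaging. The sketch below illustrates the core idea for a single weight tensor; the actual merge was produced with tooling such as mergekit, which applies this per tensor with per-model weights and densities:

```python
import torch

def dare_delta(finetuned: torch.Tensor, base: torch.Tensor,
               drop_rate: float = 0.9) -> torch.Tensor:
    """DARE step: randomly drop delta parameters, rescale the survivors."""
    delta = finetuned - base
    keep = torch.rand_like(delta) >= drop_rate
    return delta * keep / (1.0 - drop_rate)

def dare_ties_merge(base: torch.Tensor, finetuned: list[torch.Tensor],
                    drop_rate: float = 0.9) -> torch.Tensor:
    """Merge one tensor: sparsify deltas with DARE, elect a per-parameter
    sign (TIES), then average only the deltas that agree with it."""
    deltas = torch.stack([dare_delta(w, base, drop_rate) for w in finetuned])
    elected_sign = torch.sign(deltas.sum(dim=0))
    agrees = torch.sign(deltas) == elected_sign
    merged = (deltas * agrees).sum(dim=0) / agrees.sum(dim=0).clamp(min=1)
    return base + merged
```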
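
The DPO objective itself is compact: it widens the log-probability margin between a preferred and a rejected response, measured relative to a frozen reference model. A minimal sketch of the loss (Rafailov et al., 2023), independent of the specific training framework used here:

```python
import torch.nn.functional as F

def dpo_loss(pi_chosen_logps, pi_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta: float = 0.1):
    """DPO loss. Inputs are the summed token log-probabilities of each
    response under the policy (pi) and the frozen reference model (ref)."""
    chosen_margin = beta * (pi_chosen_logps - ref_chosen_logps)
    rejected_margin = beta * (pi_rejected_logps - ref_rejected_logps)
    # Maximize the implicit reward margin of the preferred response.
    return -F.logsigmoid(chosen_margin - rejected_margin).mean()
```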
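
LASER-style treatments replace selected weight matrices with low-rank approximations via truncated SVD; laserRMT additionally uses random matrix theory (the Marchenko-Pastur distribution) to decide which layers and ranks to reduce. The fixed-ratio sketch below shows only the core rank-reduction step, not the RMT-based selection:

```python
import torch

def laser_truncate(weight: torch.Tensor, keep_ratio: float = 0.3) -> torch.Tensor:
    """Replace a weight matrix with its best rank-k approximation.
    Here k is a fixed fraction of the singular values, for illustration."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    k = max(1, int(keep_ratio * S.numel()))
    return U[:, :k] @ torch.diag(S[:k]) @ Vh[:k, :]
```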

Performance Insights

Preliminary mt-bench-de evaluations indicate strong performance across various categories, particularly in:

  • Humanities: 9.325
  • STEM: 8.775
  • Writing: 8.425
  • Roleplay: 8.025

These scores suggest the model is well-suited for tasks requiring detailed and contextually appropriate German responses, especially in creative, academic, and conversational applications. The model's unique merging and training approach positions it as a robust option for German-specific NLP tasks.
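
For readers who want to try the model, a minimal generation sketch with Hugging Face transformers follows. It assumes the repository ships a chat template, which is common for Mistral-based German merges; check the model card for the exact prompt format:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mayflowergmbh/Wiedervereinigung-7b-dpo-laser"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"  # device_map requires accelerate
)

messages = [{"role": "user",
             "content": "Erkläre den Begriff 'Energiewende' in zwei Sätzen."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256,
                        do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:],
                       skip_special_tokens=True))
```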