DiscoResearch/DiscoLM_German_7b_v1

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Jan 14, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

DiscoLM German 7b v1 is a 7 billion parameter, Mistral-based large language model developed by DiscoResearch and optimized specifically for German-language applications. It was fine-tuned with supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on a large dataset of German and English instructions, and excels at German text understanding, generation, and translation. While it maintains English fluency, its primary strength is robust, reliable German output for everyday use cases.


DiscoLM German 7b v1: German-Optimized LLM

DiscoLM German 7b v1 is a Mistral-based 7 billion parameter large language model developed by DiscoResearch, building on the EM German model family. Its core focus is German-language applications: it underwent supervised fine-tuning (SFT) followed by Direct Preference Optimization (DPO) on extensive German and English instruction datasets.

Key Capabilities

  • High German Proficiency: Optimized for understanding, generating, and interacting with German language content.
  • Multilingual Fluency: Preserves strong fluency in English and excels at translation tasks.
  • Instruction Following: Trained on a diverse set of instructions for robust performance in various conversational and task-oriented scenarios.
  • ChatML Support: Uses ChatML for prompt formatting, ensuring compatibility with OpenAI-compatible endpoints and most inference libraries.
  • Optional Retrieval Format: Includes a special retrieval format to enhance steerability and reduce hallucinations in RAG applications.
  • Experimental Function Calling: Supports structured outputs and function calling, with ongoing improvements planned.
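Since the model uses ChatML, prompts follow the standard `<|im_start|>`/`<|im_end|>` turn layout. The actual chat template ships with the model's tokenizer (via `tokenizer.apply_chat_template` in Transformers); the helper below is only a minimal standalone sketch of that layout for illustration, without downloading the model.

```python
def format_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string.

    Mirrors the standard ChatML layout: each turn is wrapped in
    <|im_start|>{role} ... <|im_end|>, and the prompt ends with an
    opened assistant turn to cue the model's response.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Example conversation in German (illustrative content only)
prompt = format_chatml([
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Was ist die Hauptstadt von Deutschland?"},
])
```

In practice, prefer the tokenizer's built-in chat template, which guarantees the exact token sequence the model was trained on.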

Good For

  • German-centric Applications: Ideal for use cases requiring high-quality German text generation and comprehension.
  • Everyday Conversational AI: Designed as a reliable alternative to proprietary models for general German language interaction.
  • Translation Tasks: Demonstrates strong performance in translating between German and English.
  • RAG Systems: The optional retrieval format can be beneficial for reducing hallucinations in retrieval-augmented generation.
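The model card documents a dedicated retrieval format with its own special tokens; that format is not reproduced here. As a hedged, generic fallback, a RAG prompt can also be assembled in plain ChatML by inlining the retrieved passages into the system turn, as sketched below (prompt wording and passage numbering are illustrative assumptions, not the model's official retrieval format):

```python
def build_rag_prompt(question, passages):
    """Build a plain-ChatML RAG prompt with numbered source passages.

    NOTE: This is a generic sketch, not DiscoLM German's documented
    retrieval format. Retrieved passages are numbered [1], [2], ... and
    placed in the system turn; the user turn carries the question.
    """
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    system = (
        "Beantworte die Frage ausschließlich anhand der folgenden Quellen.\n\n"
        + context
    )
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
```

For production RAG use, the model's documented retrieval format should be preferred, since it is what the steerability and hallucination-reduction claims refer to.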

While not primarily aimed at beating benchmarks, preliminary MT Bench results for German indicate strong performance, particularly in reasoning, often comparable to or surpassing GPT-3.5-turbo in specific categories. The model prioritizes perceived language quality for native German speakers over raw benchmark scores.