DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:Apr 19, 2024Architecture:Transformer0.0K Warm

Llama 3 DiscoLM German 8b v0.1 Experimental is an 8 billion parameter Llama 3 based experimental language model developed by DiscoResearch. This model is a German-focused version of DiscoLM German, specifically designed for German language processing. It utilizes the ChatML prompt format for compatibility and steerability, making it suitable for applications requiring a German-centric LLM. The model is currently an experimental release, intended for development and not production use.

Loading preview...

Llama 3 DiscoLM German 8b v0.1 Experimental Overview

This model, developed by DiscoResearch, is an experimental 8 billion parameter Llama 3 based language model specifically adapted for the German language. It builds upon the original DiscoLM German model, aiming to provide enhanced German language capabilities. The model is currently in an experimental phase, indicating ongoing development and future improvements.

Key Features & Capabilities

  • Llama 3 Architecture: Leverages the Llama 3 base for its underlying language understanding and generation capabilities.
  • German Language Focus: Optimized for processing and generating text in German, making it suitable for German-specific applications.
  • ChatML Prompt Format: Employs the ChatML format, ensuring compatibility with OpenAI endpoints and various inference libraries, and allowing for steerable system prompts.
  • Experimental Release: This is a development version, not intended for production, with continuous updates planned.

Important Considerations

  • Known Issue: The model currently generates random reserved special tokens at the end of outputs; users should employ skip_special_tokens=true during decoding.
  • Limitations: Like other LLMs, it can produce factually incorrect or biased content. Users are responsible for implementing safety and moderation layers.
  • License: Distributed under the META LLAMA 3 COMMUNITY LICENSE.

This model is a collaborative effort by JP Harries, Björn Plüster, and Daniel Auras from DiscoResearch, with sponsorship from ellamind and compute resources from sysGen GmbH.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p