Name: DiscoResearch/Llama3-DiscoLeo-Instruct-8B-32k-v0.1 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: DiscoResearch

Model Overview

DiscoResearch/Llama3-DiscoLeo-Instruct-8B-32k-v0.1 is an instruction-tuned variant of the Llama3-8B architecture, developed through a collaboration between DiscoResearch and Occiglot, with contributions from DFKI and hessian.Ai. It builds upon a base model continuously pretrained on 65 billion high-quality German tokens, similar to LeoLM and Occiglot models. A key feature is its extended context window, achieved by training on an additional 100 million tokens at a 32k context length, utilizing a rope_theta value of 1.5e6.

Key Capabilities

German Language Proficiency: Continuously pretrained on a vast corpus of German tokens, making it highly capable for German-centric tasks.
Extended Context Window: Supports a 32k context length, enabling processing and generation of longer texts.
Instruction Following: Fine-tuned on a dedicated German instruction dataset, enhancing its ability to understand and execute complex instructions.
Llama-3 Chat Template: Utilizes the standard Llama-3 chat template for easy integration with transformers chat templating.

Performance

Evaluated against common English and German benchmarks (GermanBench), the model demonstrates strong performance, particularly in German-specific metrics. It achieves a mean score of 0.60547 across various benchmarks, showing competitive results against Meta's Llama-3-8B-Instruct and other DiscoResearch models.

Good For

Applications requiring robust German language understanding and generation.
Tasks benefiting from a large context window, such as summarizing long documents or engaging in extended conversations in German.
Instruction-following tasks where precise responses to German prompts are critical.