DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1
DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1 is an 8 billion parameter instruction-tuned language model developed by DiscoResearch and Occiglot, with support from DFKI and hessian.Ai. Derived from Meta's Llama3-8B, it underwent continuous pretraining on 65 billion high-quality German tokens and was fine-tuned on a German instruction dataset. This model excels in German language understanding and generation, offering strong performance across various German benchmarks with an 8192 token context length.
Loading preview...
Llama3-DiscoLeo-Instruct-8B-v0.1 Overview
This model is an 8 billion parameter instruction-tuned variant, a collaborative effort by DiscoResearch and Occiglot, supported by DFKI and hessian.Ai. It builds upon Meta's Llama3-8B, having undergone extensive continuous pretraining on 65 billion high-quality German tokens, similar to established LeoLM and Occiglot models. The instruction tuning phase utilized a dedicated German instruction dataset developed by DiscoResearch.
Key Capabilities & Features
- Optimized for German Language: Continuously pretrained on a massive German token dataset, making it highly proficient in German understanding and generation.
- Instruction-Tuned: Fine-tuned on a specific German instruction dataset for improved conversational and task-oriented performance.
- Llama-3 Chat Template: Utilizes the standard Llama-3 chat template, ensuring compatibility and ease of use with
transformerslibrary's chat templating. - Strong German Benchmark Performance: Achieves a mean score of 0.60552 across a suite of German and English benchmarks, outperforming Meta-Llama-3-8B-Instruct in several German-specific evaluations like
truthful_qa_deandarc_challenge_de. - 8192 Token Context Length: Supports a substantial context window for processing longer inputs and generating more coherent responses.
When to Use This Model
- German Language Applications: Ideal for chatbots, content generation, summarization, and question-answering systems requiring high proficiency in German.
- Instruction-Following Tasks: Suited for applications where the model needs to adhere to specific instructions and generate structured outputs.
- Research and Development: A valuable resource for researchers focusing on German NLP and evaluating instruction-tuned models in a multilingual context.
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.