DiscoResearch/Llama3-German-8B
Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 8k · Published: May 23, 2024 · License: llama3 · Architecture: Transformer
DiscoResearch/Llama3-German-8B is an 8-billion-parameter large language model based on Meta's Llama3-8B, continually pretrained on 65 billion high-quality German tokens. Developed by DiscoResearch and Occiglot, it significantly improves German linguistic capability and reasoning, most notably on the Hellaswag benchmark, while maintaining English performance. The model targets German-language tasks, addressing Llama3's weaker German performance caused by its limited multilingual training data, and is intended as a base model for further German-specific fine-tuning.
Popular Sampler Settings
Top 3 parameter combinations used by Featherless users for this model, covering: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p (specific values not shown).
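As a sketch of how sampler settings like these could be passed to an OpenAI-compatible completion endpoint, the snippet below assembles a request payload for this model. The parameter values are illustrative placeholders (not the Featherless presets, which are not shown above), and `top_k`, `repetition_penalty`, and `min_p` are extensions beyond the core OpenAI parameters that some hosted backends accept.

```python
# Sketch: assembling sampler settings for an OpenAI-compatible
# text-completion request. All numeric values are placeholders,
# NOT the actual Featherless user presets for this model.
def build_sampler_payload(prompt: str) -> dict:
    return {
        "model": "DiscoResearch/Llama3-German-8B",
        "prompt": prompt,
        "max_tokens": 256,
        # Sampling parameters listed on the page (placeholder values):
        "temperature": 0.7,
        "top_p": 0.9,
        "top_k": 40,
        "frequency_penalty": 0.0,
        "presence_penalty": 0.0,
        "repetition_penalty": 1.1,
        "min_p": 0.05,
    }

payload = build_sampler_payload("Erkläre kurz den Satz des Pythagoras.")
print(sorted(payload.keys()))
```

The payload would typically be sent as the JSON body of a POST request to the provider's completions endpoint, with an API key in the `Authorization` header.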