rinna/llama-3-youko-8b
rinna/llama-3-youko-8b is an 8 billion parameter language model developed by rinna, continually pre-trained from Meta-Llama-3-8B. This model is specifically optimized for Japanese language tasks, having been trained on an additional 22 billion tokens from a mixture of Japanese and English datasets. It significantly enhances performance on Japanese benchmarks compared to its base model, making it suitable for applications requiring strong Japanese language understanding and generation.
Overview
rinna/llama-3-youko-8b is an 8 billion parameter language model that builds on Meta-Llama-3-8B through continual pre-training. Developed by rinna, it was trained on approximately 22 billion additional tokens drawn from a mix of Japanese and English datasets, substantially improving its performance on Japanese language tasks relative to the base model.
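Because llama-3-youko-8b is a base model rather than an instruction-tuned one, it is best used for plain text completion. Below is a minimal sketch of loading it with Hugging Face transformers, assuming a GPU with bfloat16 support; the Japanese prompt and the sampling parameters are illustrative choices, not recommendations from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rinna/llama-3-youko-8b"

# Load the tokenizer (the original Meta-Llama-3-8B tokenizer) and the model.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a GPU with bfloat16 support
    device_map="auto",
)

# Plain-text completion: the model continues the Japanese prompt.
prompt = "西田幾多郎は、"  # illustrative prompt, not from the model card
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.8,  # illustrative sampling settings
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```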
Key Capabilities
- Enhanced Japanese Language Performance: Achieves significant improvements in Japanese task performance due to extensive continual pre-training on Japanese corpora.
- Base Architecture: Inherits the 32-layer, 4096-hidden-size transformer architecture of Meta-Llama-3-8B (see the config check after this list).
- Training Data: Continually trained on a blend of Japanese CC-100, Japanese C4, Japanese OSCAR, The Pile, Wikipedia, and rinna's curated Japanese datasets.
- Tokenizer: Employs the original Meta-Llama-3-8B tokenizer.
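As a quick check of the architecture and tokenizer details listed above, here is a short sketch using the standard transformers AutoConfig and AutoTokenizer APIs; the expected values in the comments are assumptions based on the Llama 3 8B architecture described here, not output captured from the model.

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "rinna/llama-3-youko-8b"

# Inspect the architecture without downloading the full weights.
config = AutoConfig.from_pretrained(model_id)
print(config.model_type)         # expected: "llama"
print(config.num_hidden_layers)  # expected: 32 layers
print(config.hidden_size)        # expected: 4096 hidden size

# The tokenizer should match the original Meta-Llama-3-8B tokenizer
# (a vocabulary of roughly 128k entries).
tokenizer = AutoTokenizer.from_pretrained(model_id)
print(len(tokenizer))
```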
Good For
- Applications requiring strong Japanese language understanding and generation.
- Developers looking for a Llama 3-based model with optimized performance in Japanese contexts.
- Research and development in multilingual NLP, particularly focusing on Japanese.