rinna/llama-3-youko-8b

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | Published: May 1, 2024 | License: llama3 | Architecture: Transformer | 0.1K | Warm

rinna/llama-3-youko-8b is an 8 billion parameter language model developed by rinna, continually pre-trained from Meta-Llama-3-8B. The model targets Japanese language tasks: it was trained on an additional 22 billion tokens drawn from a mixture of Japanese and English datasets. This improves its performance on Japanese benchmarks relative to the base model, making it suitable for applications that require strong Japanese language understanding and generation.


Overview

rinna/llama-3-youko-8b is an 8 billion parameter language model that builds on Meta-Llama-3-8B through continual pre-training. Developed by rinna, the model was trained on approximately 22 billion additional tokens from a mix of Japanese and English datasets. The goal of this additional training is to improve the model's performance on Japanese language tasks.

Key Capabilities

  • Enhanced Japanese Language Performance: Improves on the base model's Japanese task performance through continual pre-training on Japanese corpora.
  • Base Architecture: Uses the 32-layer, 4096-hidden-size transformer architecture of Meta-Llama-3-8B.
  • Training Data: Continually trained on a blend of Japanese CC-100, Japanese C4, Japanese OSCAR, The Pile, Wikipedia, and rinna's curated Japanese datasets.
  • Tokenizer: Employs the original Meta-Llama-3-8B tokenizer.
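
To make the "8 billion parameter" figure concrete, the parameter count can be rebuilt from the layer shapes. The sketch below is a rough illustration, not an official calculation: the layer count and hidden size come from the text, while the intermediate size, head counts, vocabulary size, and untied embeddings are assumptions taken from the published Meta-Llama-3-8B configuration. The FP8 memory estimate uses the quantization named in the listing and covers weights only (no KV cache or activations).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LlamaShape:
    num_layers: int = 32            # stated in the model description
    hidden_size: int = 4096         # stated in the model description
    intermediate_size: int = 14336  # assumption: Meta-Llama-3-8B config
    num_heads: int = 32             # assumption: Meta-Llama-3-8B config
    num_kv_heads: int = 8           # assumption: grouped-query attention
    vocab_size: int = 128256        # assumption: Meta-Llama-3-8B tokenizer


def count_parameters(s: LlamaShape) -> int:
    head_dim = s.hidden_size // s.num_heads
    kv_dim = head_dim * s.num_kv_heads
    # Query and output projections are hidden x hidden; key and value
    # projections are hidden x kv_dim because KV heads are shared.
    attention = 2 * s.hidden_size * s.hidden_size + 2 * s.hidden_size * kv_dim
    # Gate, up, and down projections of the gated MLP.
    mlp = 3 * s.hidden_size * s.intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * s.hidden_size
    per_layer = attention + mlp + norms
    # Input embedding and output head, assumed untied.
    embeddings = 2 * s.vocab_size * s.hidden_size
    final_norm = s.hidden_size
    return s.num_layers * per_layer + embeddings + final_norm


total = count_parameters(LlamaShape())
print(f"{total:,} parameters")
print(f"about {total / 1e9:.2f} GB of weights at 1 byte per parameter (FP8)")
```

Under these assumptions the total comes to roughly 8.03 billion parameters, consistent with the "8B" label.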

Good For

  • Applications requiring strong Japanese language understanding and generation.
  • Developers looking for a Llama 3-based model with optimized performance in Japanese contexts.
  • Research and development in multilingual NLP, particularly focusing on Japanese.
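
For developers who want to try the model locally, loading it might look like the following minimal sketch. It assumes the `torch`, `transformers`, and `accelerate` packages are installed and that the hardware can hold the weights in bfloat16; the `generate` helper and its defaults are hypothetical names introduced here for illustration. Because this is a continually pre-trained base model, the sketch treats it as a plain text-completion model rather than a chat model.

```python
MODEL_ID = "rinna/llama-3-youko-8b"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Complete `prompt` with the model (hypothetical helper for illustration)."""
    # Imported lazily so that defining this function does not require
    # the heavy dependencies to be installed.
    import torch
    import transformers

    # device_map="auto" relies on the accelerate package to place weights.
    pipe = transformers.pipeline(
        "text-generation",
        model=MODEL_ID,
        model_kwargs={"torch_dtype": torch.bfloat16},
        device_map="auto",
    )
    outputs = pipe(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    # The pipeline returns a list of dicts; the completion includes the prompt.
    return outputs[0]["generated_text"]
```

Calling `generate("日本で一番高い山は")` would download the weights on first use and return the prompt followed by a sampled continuation.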

Popular Sampler Settings

Featherless reports the top 3 parameter combinations its users apply to this model. Each combination sets the following sampler parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, min_p.
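
As a rough illustration of what four of these parameters do, the sketch below applies temperature, top_k, top_p, and min_p to a toy list of logits. It is a simplified sketch under stated assumptions, not the Featherless implementation: every filter is computed on the same temperature-scaled distribution, whereas real inference engines differ in the order and renormalization of these steps. The three penalty parameters are omitted, and the function name and example values are hypothetical.

```python
import math


def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Return renormalized token probabilities after sampler filtering."""
    # Temperature rescales logits: below 1 sharpens, above 1 flattens.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # numerically stable softmax
    norm = sum(exps)
    probs = [e / norm for e in exps]

    # Token indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)

    if top_k > 0:
        # top_k: keep only the k most likely tokens.
        keep &= set(order[:top_k])

    if top_p < 1.0:
        # top_p: keep the smallest prefix whose cumulative mass reaches top_p.
        nucleus, cumulative = set(), 0.0
        for i in order:
            nucleus.add(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        keep &= nucleus

    if min_p > 0.0:
        # min_p: drop tokens below a fraction of the top token's probability.
        threshold = min_p * probs[order[0]]
        keep &= {i for i in order if probs[i] >= threshold}

    kept_mass = sum(probs[i] for i in keep)
    return [probs[i] / kept_mass if i in keep else 0.0 for i in range(len(probs))]


# Hypothetical values chosen only to show the call shape.
example = filter_probs([2.0, 1.0, 0.0, -1.0], temperature=0.8, top_k=3, top_p=0.9)
```

The most likely token always survives every filter, so the renormalization never divides by zero.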