rinna/youri-7b

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Oct 30, 2023 · License: llama2 · Architecture: Transformer · Open Weights

rinna/youri-7b is a 7-billion-parameter transformer-based causal language model developed by rinna, continually pre-trained from Llama-2-7b on roughly 40 billion tokens of mixed Japanese and English data. The additional training substantially improves performance on Japanese language tasks. The model retains Llama-2's 4096-token context length and original tokenizer.


rinna/youri-7b: Japanese-Optimized Llama-2 Continual Pre-training

rinna/youri-7b is a 7-billion parameter language model developed by rinna, built upon the Llama-2-7b architecture. Its primary distinction lies in its continual pre-training on approximately 40 billion tokens from a diverse mixture of Japanese and English datasets, including Japanese CC-100, C4, OSCAR, The Pile, Wikipedia, and rinna's curated Japanese dataset. This extensive training significantly enhances its performance on Japanese language tasks.
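
Since the model retains the standard Llama-2 architecture and tokenizer, it loads through the usual Hugging Face transformers causal-LM interface. Below is a minimal sketch, assuming fp16 weights on a single GPU and illustrative sampling settings; none of these values are official recommendations from rinna.

```python
# Minimal loading/generation sketch using Hugging Face transformers.
# The dtype, device placement, and sampling settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("rinna/youri-7b")
model = AutoModelForCausalLM.from_pretrained(
    "rinna/youri-7b",
    torch_dtype=torch.float16,  # assumption: fp16 to fit a single GPU
    device_map="auto",
)

prompt = "西田幾多郎は、"  # illustrative Japanese prompt: "Kitaro Nishida was ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.8,  # illustrative sampling values
        top_p=0.95,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

For deterministic output (e.g., when comparing against benchmark results), set `do_sample=False` and drop the temperature and top-p arguments.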

Key Capabilities & Features

  • Japanese Language Proficiency: Substantially improved capabilities for Japanese text generation and understanding due to specialized continual pre-training.
  • Llama-2 Foundation: Inherits the robust 32-layer, 4096-hidden-size transformer architecture of Llama-2-7b.
  • Standard Tokenization: Uses the original, unmodified Llama-2 tokenizer (see the tokenizer sketch after this list).
  • Benchmarking: Performance metrics are available on rinna's LM benchmark page and the Open LLM Leaderboard.
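
Because the vocabulary is unchanged from Llama-2, Japanese text may tokenize less compactly than it would under a Japanese-specific vocabulary, with rarer characters falling back to byte-level pieces. A quick way to inspect this (the example string is illustrative):

```python
# Sketch: inspect how the unmodified Llama-2 tokenizer segments Japanese text.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("rinna/youri-7b")

text = "今日はいい天気ですね。"  # "Nice weather today, isn't it?"
ids = tokenizer(text)["input_ids"]
print(len(ids))
print(tokenizer.convert_ids_to_tokens(ids))
# Rare characters may surface as byte-fallback pieces (e.g. "<0xE6>"-style
# tokens), so Japanese prompts can consume more of the 4096-token context
# than their character count suggests.
```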

Ideal Use Cases

  • Japanese NLP Applications: Recommended for tasks requiring strong Japanese text generation, comprehension, and translation; a few-shot prompting sketch follows this list.
  • Research & Development: Suitable for researchers and developers exploring multilingual LLMs, particularly those focusing on Japanese language models.
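
As a continually pre-trained base model rather than an instruction-tuned one, youri-7b is typically driven by few-shot completion prompts instead of chat-style instructions. Below is a hedged sketch for Japanese-to-English translation; the prompt layout and example pairs are assumptions for illustration, not an official template from rinna.

```python
# Few-shot translation sketch; the prompt format and examples are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("rinna/youri-7b")
model = AutoModelForCausalLM.from_pretrained(
    "rinna/youri-7b", torch_dtype=torch.float16, device_map="auto"
)

# Frame the task as completion: a few Japanese/English pairs, then the query.
prompt = (
    "日本語: 猫が好きです。\nEnglish: I like cats.\n"
    "日本語: 明日は雨が降るでしょう。\nEnglish: It will probably rain tomorrow.\n"
    "日本語: この本はとても面白いです。\nEnglish:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=32, do_sample=False)
# Decode only the newly generated continuation, not the prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```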