stabilityai/japanese-stablelm-base-beta-70b
Japanese-StableLM-Base-Beta-70B is a 70-billion-parameter decoder-only language model developed by Stability AI Japan, based on the Llama-2-70b architecture. It has been fine-tuned on approximately 100 billion tokens of diverse Japanese data, including Wikipedia, mC4, CC-100, OSCAR, and SlimPajama. The model is optimized to maximize downstream performance on Japanese language tasks and serves as a foundation for application-specific fine-tuning.
Japanese-StableLM-Base-Beta-70B Overview
japanese-stablelm-base-beta-70b is a 70-billion-parameter, decoder-only language model developed by Stability AI Japan. It is built on the Llama-2-70b architecture and has undergone extensive fine-tuning on a diverse collection of Japanese datasets, with the primary goal of high performance on Japanese language tasks.
Key Characteristics
- Architecture: Based on the robust Llama-2-70b transformer architecture.
- Language Focus: Specifically fine-tuned for the Japanese language.
- Training Data: Continued pre-training on roughly 100 billion tokens drawn from corpora such as Japanese/English Wikipedia, Japanese mC4, Japanese CC-100, Japanese OSCAR, and SlimPajama (excluding Books3).
- License: Released under the Llama 2 Community License, allowing for broad use.
Intended Use Cases
This model is designed as a foundation model, suitable for:
- Application-specific fine-tuning: Developers can further fine-tune the model for their particular Japanese language applications without strict commercial limitations; a sketch of one common approach appears at the end of this section.
- Japanese language processing: Excels at tasks requiring strong Japanese language understanding and generation; a minimal loading example follows this list.
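As a starting point, here is a minimal sketch of loading the base model for text generation with the Hugging Face transformers library. The prompt and sampling parameters are illustrative assumptions, not an official recipe, and a 70B model in half precision needs on the order of 140 GB of accelerator memory, so device_map="auto" is used to shard or offload the weights.

```python
# Minimal sketch: load japanese-stablelm-base-beta-70b with Hugging Face
# transformers and generate a continuation. Prompt text and sampling
# parameters below are illustrative, not an official recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "stabilityai/japanese-stablelm-base-beta-70b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # shard across GPUs / offload as needed
)

# A base model continues text rather than following instructions,
# so prompt it with the opening of the passage you want completed.
prompt = "AI で科学研究を加速するには、"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Because this is a base (non-instruct) checkpoint, expect autocomplete-style behavior; instruction-style prompts belong with the Instruct variant mentioned below.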
For instruction-following capabilities, users are directed to the Japanese-StableLM-Instruct-Beta-70B model. Smaller 7B versions and models optimized for faster inference (e.g., Japanese-StableLM-Base-JA_Vocab-Beta-7B) are also available.
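For the application-specific fine-tuning described above, one common approach is parameter-efficient tuning with LoRA via the Hugging Face peft library. The sketch below is an illustration under assumptions, not an official recipe: the rank, target modules, and hyperparameters are placeholders to adapt to your data and hardware.

```python
# Hedged sketch: attach LoRA adapters to the base model with the peft
# library. Dataset, target modules, and hyperparameters are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "stabilityai/japanese-stablelm-base-beta-70b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# LoRA trains small low-rank update matrices instead of all 70B weights.
lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank updates
    lora_alpha=32,                        # scaling factor for the updates
    target_modules=["q_proj", "v_proj"],  # Llama-2 attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights

# From here, train on your Japanese corpus (e.g., with transformers.Trainer
# or a custom loop), then persist only the small adapter:
# model.save_pretrained("my-japanese-lora-adapter")
```

Only the adapter weights need to be saved and shared, which keeps application-specific variants small relative to the 70B base checkpoint.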