stabilityai/japanese-stablelm-base-beta-70b

Text Generation · Model Size: 69B · Quant: FP8 · Context Length: 32k · Published: Oct 30, 2023 · License: Llama 2 · Architecture: Transformer · Concurrency Cost: 4

Japanese-StableLM-Base-Beta-70B is a 70 billion parameter decoder-only language model developed by Stability AI Japan, based on the Llama-2-70b architecture. It has been fine-tuned on approximately 100 billion tokens of diverse Japanese data, including Wikipedia, mc4, CC-100, OSCAR, and SlimPajama. This model is specifically optimized for maximizing downstream performance on Japanese language tasks, serving as a foundational model for application-specific fine-tuning.

Japanese-StableLM-Base-Beta-70B Overview

japanese-stablelm-base-beta-70b is a 70-billion parameter, decoder-only language model developed by Stability AI Japan. It is built upon the Llama-2-70b architecture and has undergone extensive fine-tuning using a diverse collection of Japanese datasets. The primary goal of this model is to achieve high performance on Japanese language tasks.

Key Characteristics

  • Architecture: Based on the robust Llama-2-70b transformer architecture.
  • Language Focus: Specifically fine-tuned for the Japanese language.
  • Training Data: Continued pre-training on roughly 100 billion tokens from corpora such as Japanese/English Wikipedia, Japanese mc4, Japanese CC-100, Japanese OSCAR, and SlimPajama (excluding Books3).
  • License: Released under the Llama 2 Community License, which permits commercial use subject to its terms.
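As a minimal sketch, loading the model for text generation with Hugging Face `transformers` might look like the following. This is an assumption, not an official recipe from the model card: the dtype, device mapping, and sampling settings are illustrative, and running a 70B model requires multiple GPUs or aggressive offloading.

```python
# Hedged sketch: text generation with the base model via transformers.
# Assumes `transformers` and `torch` are installed and sufficient memory
# is available; all generation settings here are illustrative.

MODEL_ID = "stabilityai/japanese-stablelm-base-beta-70b"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a Japanese continuation of `prompt` with the base model."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # halves memory vs FP32; weights are still ~140 GB
        device_map="auto",          # shard across available GPUs / offload to CPU
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.8,
    )
    # Strip the prompt tokens and decode only the newly generated text.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

# Example usage (downloads the full weights):
# print(generate("日本の四季の特徴について説明します。"))
```

Because this is a base (non-instruct) model, prompts should be written as text to be continued rather than as instructions.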

Intended Use Cases

This model is designed as a foundational base model, suitable for:

  • Application-specific fine-tuning: Developers can fine-tune this model for their particular Japanese-language applications; the Llama 2 Community License permits commercial use subject to its terms.
  • Japanese language processing: Excels at tasks requiring strong Japanese language understanding and generation.
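For application-specific fine-tuning, a common approach at this model scale is parameter-efficient tuning such as LoRA. The sketch below uses the `peft` library; this is an assumption on my part (the model card does not prescribe a fine-tuning method), and the adapter hyperparameters and `target_modules` names (which follow the Llama-2 attention layout) are illustrative placeholders.

```python
# Hypothetical LoRA fine-tuning setup with peft; values are illustrative,
# not taken from the model card. Assumes `transformers` and `peft` installed.

MODEL_ID = "stabilityai/japanese-stablelm-base-beta-70b"

LORA_KWARGS = dict(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # Llama-2 attention projections
    task_type="CAUSAL_LM",
)

def build_lora_model():
    """Wrap the frozen base model with small trainable LoRA adapters."""
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    model = get_peft_model(base, LoraConfig(**LORA_KWARGS))
    model.print_trainable_parameters()  # only the adapter weights are trainable
    return model
```

The returned model can then be passed to a standard `transformers` `Trainer` with a Japanese task dataset; only the adapter weights (a small fraction of the 70B parameters) are updated.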

For instruction-following capabilities, users are directed to the Japanese-StableLM-Instruct-Beta-70B model. Smaller 7B versions and models optimized for faster inference (e.g., Japanese-StableLM-Base-JA_Vocab-Beta-7B) are also available.