stabilityai/japanese-stablelm-base-beta-70b

Text Generation · Model Size: 69B · Quant: FP8 · Context Length: 32k · Published: Oct 30, 2023 · License: Llama 2 · Architecture: Transformer · Concurrency Cost: 4

Japanese-StableLM-Base-Beta-70B is a 70 billion parameter decoder-only language model developed by Stability AI Japan, based on the Llama-2-70b architecture. It has been fine-tuned on approximately 100 billion tokens of diverse Japanese data, including Wikipedia, mc4, CC-100, OSCAR, and SlimPajama. This model is specifically optimized for maximizing downstream performance on Japanese language tasks, serving as a foundational model for application-specific fine-tuning.

Japanese-StableLM-Base-Beta-70B Overview

japanese-stablelm-base-beta-70b is a 70-billion parameter, decoder-only language model developed by Stability AI Japan. It is built upon the Llama-2-70b architecture and has undergone extensive fine-tuning using a diverse collection of Japanese datasets. The primary goal of this model is to achieve high performance on Japanese language tasks.

Key Characteristics

  • Architecture: Based on the robust Llama-2-70b transformer architecture.
  • Language Focus: Specifically fine-tuned for the Japanese language.
  • Training Data: Continued pre-training on roughly 100 billion tokens from corpora such as Japanese/English Wikipedia, Japanese mc4, Japanese CC-100, Japanese OSCAR, and SlimPajama (excluding Books3).
  • License: Released under the Llama 2 Community License, which permits commercial use subject to its terms.
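As a minimal sketch, loading the model for text generation with Hugging Face `transformers` might look like the following. This is an assumption, not an official recipe from the model card: the dtype, device mapping, and sampling settings are illustrative, and running a 70B model requires multiple GPUs or aggressive offloading.

```python
# Hedged sketch: text generation with the base model via transformers.
# Assumes `transformers` and `torch` are installed and sufficient memory
# is available; all generation settings here are illustrative.

MODEL_ID = "stabilityai/japanese-stablelm-base-beta-70b"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a Japanese continuation of `prompt` with the base model."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # halves memory vs FP32; weights are still ~140 GB
        device_map="auto",          # shard across available GPUs / offload to CPU
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.8,
    )
    # Strip the prompt tokens and decode only the newly generated text.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

# Example usage (downloads the full weights):
# print(generate("日本の四季の特徴について説明します。"))
```

Because this is a base (non-instruct) model, prompts should be written as text to be continued rather than as instructions.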

Intended Use Cases

This model is designed as a foundational base model, suitable for:

  • Application-specific fine-tuning: Developers can fine-tune this model for their particular Japanese-language applications; the Llama 2 Community License permits commercial use subject to its terms.
  • Japanese language processing: Excels at tasks requiring strong Japanese language understanding and generation.
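For application-specific fine-tuning, a common approach at this model scale is parameter-efficient tuning such as LoRA. The sketch below uses the `peft` library; this is an assumption on my part (the model card does not prescribe a fine-tuning method), and the adapter hyperparameters and `target_modules` names (which follow the Llama-2 attention layout) are illustrative placeholders.

```python
# Hypothetical LoRA fine-tuning setup with peft; values are illustrative,
# not taken from the model card. Assumes `transformers` and `peft` installed.

MODEL_ID = "stabilityai/japanese-stablelm-base-beta-70b"

LORA_KWARGS = dict(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # Llama-2 attention projections
    task_type="CAUSAL_LM",
)

def build_lora_model():
    """Wrap the frozen base model with small trainable LoRA adapters."""
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    model = get_peft_model(base, LoraConfig(**LORA_KWARGS))
    model.print_trainable_parameters()  # only the adapter weights are trainable
    return model
```

The returned model can then be passed to a standard `transformers` `Trainer` with a Japanese task dataset; only the adapter weights (a small fraction of the 70B parameters) are updated.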

For instruction-following capabilities, users are directed to the Japanese-StableLM-Instruct-Beta-70B model. Smaller 7B versions and models optimized for faster inference (e.g., Japanese-StableLM-Base-JA_Vocab-Beta-7B) are also available.