stabilityai/japanese-stablelm-instruct-beta-7b

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Oct 30, 2023 · License: Llama2 · Architecture: Transformer

stabilityai/japanese-stablelm-instruct-beta-7b is a 7-billion-parameter decoder-only language model developed by Stability AI Japan, based on the Llama2 transformer architecture. It is fine-tuned on Japanese-translated versions of Databricks Dolly-15k, Anthropic HH, and other public datasets. The model is specifically designed for instruction-following tasks in Japanese, offering a specialized solution for applications that require high-quality Japanese language generation and understanding.


Japanese-StableLM-Instruct-Beta-7B Overview

japanese-stablelm-instruct-beta-7b is a 7 billion parameter instruction-tuned language model developed by Stability AI Japan. It is built upon the Llama2 transformer architecture and is a fine-tuned version of japanese-stablelm-base-beta-7b.

Key Capabilities & Features

  • Japanese Language Focus: Specifically designed and fine-tuned for instruction-following tasks in Japanese.
  • Instruction-Tuned: Enhanced for conversational and instruction-based interactions through fine-tuning on Japanese-translated datasets, including Anthropic HH-RLHF, Databricks Dolly 15k, and the OpenAssistant Conversations Dataset (see the inference sketch after this list).
  • Llama2 Architecture: Leverages the robust Llama2 transformer architecture for its foundational capabilities.
  • Model Family: A larger 70B-parameter version (japanese-stablelm-instruct-beta-70b) and a faster variant with a specialized Japanese tokenizer (japanese-stablelm-instruct-ja_vocab-beta-7b) are also available.
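
The sketch below is a minimal example of loading the model with Hugging Face transformers and generating a response to a Japanese instruction. The prompt template and generation parameters are assumptions (a Llama2-style [INST] wrapper with an illustrative Japanese system message); consult the official model card for the exact format the model was tuned on.

```python
# Minimal inference sketch with Hugging Face transformers.
# The prompt format below is an assumption, not the official template.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "stabilityai/japanese-stablelm-instruct-beta-7b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision fits on a single ~16 GB GPU
    device_map="auto",
)

def build_prompt(user_query: str) -> str:
    # Llama2-style instruction wrapper with a Japanese system message (assumed).
    system = "あなたは役立つアシスタントです。"
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user_query} [/INST] "

prompt = build_prompt("日本の四季について簡潔に説明してください。")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        top_p=0.95,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```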

Intended Use Cases

  • Foundation Model: Suitable as a base for application-specific fine-tuning, with commercial use permitted under the Llama2 Community License (see the fine-tuning sketch after this list).
  • Japanese NLP Applications: Ideal for developers building applications that require high-quality Japanese text generation, summarization, or conversational AI.
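
As an illustration of the foundation-model use case above, the following is a minimal sketch of attaching LoRA adapters with the Hugging Face peft library for parameter-efficient, application-specific fine-tuning. The hyperparameters and target modules are illustrative assumptions, not recommendations from Stability AI.

```python
# Parameter-efficient fine-tuning sketch using LoRA via the peft library.
# Hyperparameters and target modules are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model_name = "stabilityai/japanese-stablelm-instruct-beta-7b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# Attach low-rank adapters to the attention projections (typical Llama2 targets).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# From here, train with transformers.Trainer or a custom loop on your
# application-specific Japanese instruction data.
```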

Limitations

  • The model may reflect biases or inappropriate content present in its pre-training datasets, necessitating caution in production environments.