rinna/nekomata-7b-instruction

Task: Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 32k | Published: Dec 19, 2023 | License: other | Architecture: Transformer

rinna/nekomata-7b-instruction is a 7-billion-parameter instruction-tuned causal language model developed by rinna, based on a 32-layer, 4096-hidden-size transformer architecture. It adopts the Alpaca input format and is fine-tuned on a mix of Japanese and English instruction data. The model is intended for general instruction-following tasks, with an emphasis on Japanese language understanding and generation.


Model Overview

rinna/nekomata-7b-instruction is an instruction-tuned variant of the rinna/nekomata-7b base model, developed by rinna. It is a 32-layer, 4096-hidden-size transformer whose architecture is described in the Qwen paper. The model has 7 billion parameters and supports a context length of 32,768 tokens.

Key Capabilities

  • Instruction Following: Fine-tuned to adhere to instructions provided in the Alpaca format.
  • Bilingual Support: Training data includes both English and Japanese datasets, making the model suitable for tasks in either language.
  • Diverse Training Data: Utilizes a curated subset of datasets such as Databricks Dolly (English and Japanese versions), FLAN Instruction Tuning data (English and Japanese translations), and specific sections of the Izumi lab LLM Japanese dataset.
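
Because the model expects Alpaca-format input, a prompt has to be assembled in that layout before generation. The sketch below uses the standard English Alpaca template as an assumption; the model's own card may specify different (for example Japanese) header wording, which should take precedence. The function name `build_alpaca_prompt` is hypothetical and used only for illustration.

```python
def build_alpaca_prompt(instruction: str, context: str = "") -> str:
    """Assemble an Alpaca-style prompt.

    Assumption: the standard English Alpaca headers are used here;
    check the model card for the exact wording the model was tuned on.
    """
    if context:
        # Variant with an additional input/context section.
        header = (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request."
        )
        return (
            f"{header}\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    # Variant with an instruction only.
    header = (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request."
    )
    return f"{header}\n\n### Instruction:\n{instruction}\n\n### Response:\n"


# Example: an instruction plus a Japanese input sentence.
prompt = build_alpaca_prompt("Translate the input into English.", "猫が好きです。")
```

The resulting string ends with the response header, so the model's generated text continues directly as the answer.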

Good For

  • General Instruction-Following: Capable of handling a wide range of prompts and tasks based on given instructions.
  • Japanese Language Tasks: Well suited to Japanese, since the fine-tuning data includes Japanese instruction datasets alongside the English ones.
  • Research and Development: Provides a solid foundation for further experimentation and fine-tuning on specific downstream tasks, especially within the Japanese NLP domain.

For detailed benchmarking results, refer to rinna's LM benchmark page (Sheet 20231221).