tokyotech-llm/Swallow-70b-instruct-v0.1
Swallow-70b-instruct-v0.1 by tokyotech-llm is a 70 billion parameter instruction-tuned language model built upon the Llama 2 architecture, continually pre-trained with extensive Japanese language data. It features a vocabulary expanded with Japanese data for efficient text representation and faster inference. This model excels in Japanese language tasks, demonstrating strong performance across various categories including writing, roleplay, and extraction on the MT-Bench JA benchmark.
Loading preview...
Swallow-70b-instruct-v0.1 Overview
Swallow-70b-instruct-v0.1 is a 70 billion parameter instruction-tuned language model developed by TokyoTech-LLM. It is based on the Llama 2 family, having undergone continual pre-training with a significant addition of Japanese language data. This model utilizes supervised fine-tuning (SFT) to enhance its instruction-following capabilities.
Key Capabilities & Features
- Enhanced Japanese Language Proficiency: Significantly improved performance in Japanese tasks due to extensive pre-training on Japanese data.
- Efficient Tokenization: Employs a tokenizer with a broadened vocabulary for Japanese, leading to more efficient text representation and faster inference.
- Instruction-Tuned: Optimized for following instructions, as evidenced by its performance on the MT-Bench JA benchmark.
- Strong Benchmark Performance: Achieves an overall score of 0.4513 on MT-Bench JA, outperforming previous Swallow versions and other 70B Japanese models like japanese-stablelm-instruct-beta-70b in categories such as writing, roleplay, and extraction.
When to Use This Model
- Japanese Language Applications: Ideal for use cases requiring high-quality Japanese text generation and understanding.
- Instruction Following: Suitable for tasks where precise adherence to instructions is critical.
- Research and Development: A strong candidate for further research and fine-tuning in Japanese NLP contexts.
This model represents a significant advancement in Japanese LLM capabilities, offering improved performance and efficiency for a wide range of applications.