KARAKURI LM 70B Chat v0.1 Overview
KARAKURI LM 70B Chat v0.1 is a 70 billion parameter language model developed by KARAKURI Inc., based on the Llama 2 architecture. It has been specifically enhanced for bilingual performance, incorporating additional Japanese vocabulary and undergoing further pretraining on a diverse mix of Japanese and multilingual datasets. The model was fine-tuned using the SteerLM technique, employing a continual learning approach that integrated both structured conversational datasets and unstructured corpora.
Key Capabilities & Performance
- Bilingual Proficiency: Optimized for both English and Japanese, demonstrating strong performance in both languages.
- Leading Japanese Performance: Scored 6.43 on the MT-Bench-jp benchmark, the highest among Japanese open models at the time of its release.
- English Parity: Maintains performance comparable to Llama 2 70B Chat on the original English MT-Bench (6.61 vs. 6.86 for Llama 2 70B Chat).
- Customizable Responses: Supports a SteerLM-style prompt template that lets users steer response attributes such as helpfulness, correctness, complexity, and creativity via attribute strings encoded in user messages.
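To make the attribute-steering idea concrete, here is a minimal sketch of how attribute strings might be encoded into a user message. The attribute names come from the list above; the `[ATTR] ... [/ATTR]` token syntax and the 0-4 value scale are illustrative assumptions, not the model's documented template, so consult the official model card for the exact format.

```python
def encode_attributes(message: str, **attrs: int) -> str:
    """Append a SteerLM-style attribute string to a user message.

    The "[ATTR] ... [/ATTR]" delimiters and 0-4 scale are hypothetical
    placeholders standing in for the model's actual prompt template.
    """
    for name, value in attrs.items():
        if not 0 <= value <= 4:
            raise ValueError(f"{name} must be in 0..4, got {value}")
    attr_str = " ".join(f"{name}: {value}" for name, value in attrs.items())
    return f"{message} [ATTR] {attr_str} [/ATTR]"

# Example: ask for a highly helpful, factual, simple, low-creativity answer.
prompt = encode_attributes(
    "Summarize this article in Japanese.",
    helpfulness=4, correctness=4, complexity=2, creativity=1,
)
```

The key design point is that steering happens purely at the prompt level: no model reloading or fine-tuning is needed to change the response style between requests.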
Training Details
The model was pretrained on a mixture of Japanese and multilingual corpora and fine-tuned on datasets including OASST2 and internal conversational data. Training was conducted on 32 Amazon EC2 trn1.32xlarge instances (AWS Trainium accelerators).
Good For
- Applications requiring high-quality conversational AI in Japanese.
- Use cases demanding strong bilingual (English-Japanese) language understanding and generation.
- Developers looking for a Llama 2-based model with enhanced Japanese capabilities and fine-tuning flexibility.