KARAKURI LM 70B v0.1 Overview
KARAKURI LM 70B v0.1 is a 70-billion-parameter, causal decoder-only transformer language model developed by KARAKURI Inc. It extends Llama 2 for bilingual English and Japanese use: the tokenizer vocabulary was augmented with additional Japanese tokens, and the model underwent continued pretraining on a diverse mixture of Japanese and multilingual corpora.
Key Capabilities & Performance
The chat-tuned version, KARAKURI LM 70B Chat v0.1, was fine-tuned with the SteerLM technique using a continual-learning approach that combines structured conversational datasets with unstructured corpora. Although Japanese accounted for only 2.5% of the tokens in the conversational datasets, the chat model scored 6.43 on MT-Bench-jp, the highest among Japanese open models at the time of its release. On the original English MT-Bench it averaged 6.61, comparable to Llama 2 70B Chat.
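SteerLM conditions the model on target attribute values (e.g. helpfulness, verbosity) at inference time, letting users steer the style of a response by editing those targets rather than retraining. A minimal sketch of how such an attribute-conditioned prompt might be assembled; the attribute names, the 0-4 score range, and the `[ATTR]`/`[/ATTR]` delimiters here are illustrative assumptions, not the documented KARAKURI LM Chat template:

```python
# Sketch of a SteerLM-style attribute-conditioned prompt builder.
# The [ATTR] ... [/ATTR] delimiters and attribute names below are
# hypothetical, chosen to illustrate the technique; consult the model
# card for the actual chat template.

def build_steerlm_prompt(user_message: str, attributes: dict[str, int]) -> str:
    """Wrap a user message in a Llama 2-style [INST] block with
    SteerLM attribute targets appended."""
    attr_str = " ".join(f"{name}: {score}" for name, score in attributes.items())
    return f"[INST] {user_message} [ATTR] {attr_str} [/ATTR] [/INST]"

prompt = build_steerlm_prompt(
    "Explain transformers briefly.",
    {"helpfulness": 4, "correctness": 4, "verbosity": 2},
)
print(prompt)
```

Lowering a target such as `verbosity` in the attribute dictionary is, under the SteerLM approach, how a caller would request a terser answer from the same model.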
Training Details
The model was trained on a variety of datasets, including mC4, CC100, OSCAR, RedPajama, and KARAKURI's internal Japanese corpora. Training ran on 32 nodes of Amazon EC2 trn1.32xlarge (AWS Trainium) instances, using code based on neuronx-nemo-megatron.
Good for
- Applications requiring strong performance in both English and Japanese.
- Chatbot and conversational AI systems targeting Japanese users.
- Developers seeking a Llama 2-based model with enhanced Japanese language understanding and generation.