Overview
Shisa V2.1 is an updated family of bilingual Japanese and English (JA/EN) general-purpose chat models from Shisa.AI; this specific model is the 70 billion parameter variant based on Llama 3.3. It features a 128K context length and is designed to deliver class-leading Japanese language performance while retaining robust English capabilities. The V2.1 series incorporates an updated dataset, refined SFT and DPO recipes, and new data aimed at improved instruction following, translation, and handling of Japanese-specific language nuances.
Key Capabilities
- Bilingual Proficiency: Excels in both Japanese and English language tasks, with a strong focus on Japanese performance.
- Enhanced Instruction Following: Benefits from new datasets and training methods aimed at better adherence to instructions.
- Reduced Cross-Lingual Token Leakage (CLTL): Substantially reduces the appearance of non-Japanese tokens in Japanese output, a critical improvement for production use cases.
- Improved Performance: The Shisa V2.1 70B model approaches the performance of the previous Shisa V2 405B model in Japanese language benchmarks.
Good For
- Japanese Language Applications: Ideal for chatbots, customer service, and content generation requiring high-quality Japanese output.
- Bilingual Systems: Suitable for environments needing seamless language switching and robust performance in both Japanese and English.
- Production Deployments: Addresses critical issues like Cross-Lingual Token Leakage, making it more reliable for real-world Japanese-language tasks.