shisa-ai/shisa-v2-mistral-nemo-12b

TEXT GENERATIONConcurrency Cost:1Model Size:12BQuant:FP8Ctx Length:32kPublished:Apr 12, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

The shisa-ai/shisa-v2-mistral-nemo-12b is a 12 billion parameter bilingual Japanese and English (JA/EN) general-purpose chat model developed by Shisa.AI. Part of the Shisa V2 family, this model is specifically optimized for Japanese language tasks while maintaining strong English capabilities, leveraging a refined synthetic-data driven post-training approach. It demonstrates significantly improved Japanese output quality compared to its base model, making it suitable for applications requiring high-performance bilingual communication.

Loading preview...

Shisa V2 Mistral-Nemo-12B: Optimized for Bilingual JA/EN Chat

Shisa V2 Mistral-Nemo-12B is a 12 billion parameter model from Shisa.AI, designed for high-quality bilingual Japanese and English chat. Unlike previous iterations that focused on tokenizer extension or costly pre-training, Shisa V2 models, including this one, emphasize a significantly expanded and refined synthetic-data driven post-training approach to achieve substantial performance gains, particularly in Japanese language tasks.

Key Capabilities & Differentiators

  • Bilingual Proficiency: Excels in Japanese language tasks while retaining robust English capabilities, making it ideal for cross-lingual applications.
  • Optimized Post-Training: Achieves improved Japanese output quality through a sophisticated synthetic-data driven fine-tuning process, rather than extensive pre-training or tokenizer modifications.
  • Strong Performance: Demonstrates superior Japanese average scores (72.83 JA AVG) compared to its base model (Mistral-Nemo-Instruct-2407 at 58.44 JA AVG) across various benchmarks like Shaberi, ELYZA 100, and JA MT Bench.
  • Comprehensive Evaluation: Evaluated using a custom "multieval" harness incorporating standard benchmarks (ELYZA Tasks 100, JA MT-Bench, Rakuda, Tengu Bench, llm-jp-eval, MixEval, LiveBench, IFEval, EvalPlus) and new Japanese-specific benchmarks (shisa-jp-ifeval, shisa-jp-rp-bench, shisa-jp-tl-bench).

When to Use This Model

  • Japanese-centric Applications: Ideal for chatbots, content generation, and translation services where high-quality Japanese output is critical.
  • Bilingual Communication: Suitable for scenarios requiring seamless switching and robust performance in both Japanese and English.
  • Research & Development: Offers a strong foundation for further fine-tuning on specific Japanese or bilingual datasets due to its optimized post-training methodology.