shisa-ai/shisa-v2-llama3.3-70b
Available on Hugging Face

Text Generation · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Ctx Length: 32k · License: llama3.3 · Architecture: Transformer

shisa-ai/shisa-v2-llama3.3-70b is a 70-billion-parameter bilingual Japanese and English (JA/EN) general-purpose chat model developed by Shisa.AI, built on Llama 3.3. It is optimized for superior performance on Japanese language tasks while maintaining strong English capabilities. The model leverages an expanded and refined synthetic-data-driven post-training approach, achieving significant gains on Japanese benchmarks. It is well suited to applications requiring high-quality, nuanced responses in both Japanese and English.


Shisa V2 Llama 3.3 70B: Bilingual Japanese/English Chat Model

Shisa V2 is a family of bilingual Japanese and English (JA/EN) general-purpose chat models developed by Shisa.AI, with shisa-v2-llama3.3-70b being the 70-billion-parameter variant. Unlike previous iterations, Shisa V2 models forgo tokenizer extension and continued pre-training, focusing instead on an optimized post-training approach using significantly expanded and refined synthetic data. This strategy has led to substantial performance improvements, particularly on Japanese language tasks.

Key Capabilities & Differentiators

  • Bilingual Excellence: Designed to excel in Japanese language tasks while retaining robust English capabilities, making it suitable for mixed-language environments.
  • Optimized Post-Training: Achieves performance gains through a refined synthetic-data-driven approach, rather than costly continued pre-training or tokenizer modifications.
  • Strong Japanese Performance: Demonstrates superior Japanese output quality compared to its base model, as evidenced by leading scores on various Japanese benchmarks like JA AVG, Shaberi AVG, ELYZA 100, JA MT Bench, Rakuda, and Tengu.
  • Comprehensive Evaluation: Evaluated using a custom "multieval" harness incorporating standard benchmarks and new Japanese-specific evaluations such as shisa-jp-ifeval (instruction-following), shisa-jp-rp-bench (role-play), and shisa-jp-tl-bench (translation proficiency).

Should You Use This Model?

This model is particularly well-suited for use cases requiring high-quality, nuanced responses in both Japanese and English. If your application demands strong performance in Japanese language understanding, generation, translation, or role-playing, shisa-v2-llama3.3-70b offers a compelling solution. Its focus on post-training optimization makes it a strong contender for applications where robust bilingual capabilities are critical.
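As a minimal sketch, the model can be driven through an OpenAI-style chat-completions API. The endpoint URL, API key, and the `build_chat_request` helper below are illustrative assumptions, not part of any official client:

```python
# Hypothetical sketch: building an OpenAI-style chat-completions payload
# for this model. The helper name and default prompt are assumptions.
import json


def build_chat_request(
    user_message: str,
    system_prompt: str = "You are a helpful bilingual (JA/EN) assistant.",
) -> dict:
    """Assemble a chat-completions request body targeting this model."""
    return {
        "model": "shisa-ai/shisa-v2-llama3.3-70b",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": 512,  # illustrative; the model itself supports a 32k context
    }


payload = build_chat_request("日本語で自己紹介してください。")
print(json.dumps(payload, ensure_ascii=False, indent=2))

# The payload would then be POSTed to an OpenAI-compatible endpoint, e.g.:
# requests.post("https://<provider>/v1/chat/completions",
#               headers={"Authorization": "Bearer <API_KEY>"},
#               json=payload)
```

Because the model is bilingual, the same request shape works for Japanese, English, or mixed-language conversations; only the message content changes.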

Popular Sampler Settings

Featherless surfaces the top three parameter combinations used by its users for this model. The tunable sampler parameters are:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
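A sampler configuration covering the parameters above might be sketched as follows. Every value here is an illustrative placeholder, not one of the actual user-derived configurations from Featherless:

```python
# Illustrative sampler configuration covering the parameters listed above.
# Values are placeholders, NOT the actual top settings from Featherless users.
sampler_settings = {
    "temperature": 0.7,         # randomness of token selection
    "top_p": 0.9,               # nucleus-sampling probability cutoff
    "top_k": 40,                # keep only the k most likely tokens
    "frequency_penalty": 0.0,   # penalize tokens by how often they appeared
    "presence_penalty": 0.0,    # penalize tokens that appeared at all
    "repetition_penalty": 1.1,  # multiplicative penalty on repeated tokens
    "min_p": 0.05,              # drop tokens below this fraction of the top probability
}

# These keys could be merged into an OpenAI-style request body, e.g.:
# payload = {"model": "shisa-ai/shisa-v2-llama3.3-70b", **sampler_settings, ...}
print(sorted(sampler_settings))
```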