shisa-ai/shisa-v2-mistral-small-24b
Shisa V2 is a family of bilingual Japanese and English (JA/EN) general-purpose chat models developed by Shisa.AI; this specific model is shisa-v2-mistral-small-24b. These models are optimized for Japanese language tasks while maintaining strong English capabilities, focusing on post-training optimization rather than tokenizer extension or continued pre-training. The Shisa V2 models leverage an expanded and refined synthetic-data-driven approach, achieving substantial performance gains in Japanese language processing. They are suitable for applications requiring robust bilingual chat functionality, particularly in Japanese contexts.
Shisa V2: Bilingual Japanese/English Chat Models
Shisa V2 is a series of general-purpose chat models developed by Shisa.AI, designed to excel in Japanese language tasks while retaining robust English capabilities. Unlike previous iterations, Shisa V2 focuses on optimizing post-training through an expanded and refined synthetic-data-driven approach, rather than on tokenizer extension or costly continued pre-training.
Key Capabilities & Features
- Bilingual Proficiency: Strong performance in both Japanese and English, with a particular emphasis on Japanese output quality.
- Optimized Post-Training: Achieves significant performance gains through advanced synthetic data and fine-tuning techniques.
- Robust Model Family: Part of a diverse family ranging from 7B to 70B parameters, all trained with consistent datasets and recipes.
- Extensive Evaluation: Benchmarked using a custom "multieval" harness, including standard benchmarks and newly developed Japanese-specific evaluations such as `shisa-jp-ifeval`, `shisa-jp-rp-bench`, and `shisa-jp-tl-bench`.
- Flexible Usage: Inherits chat templates from base models and is validated for inference with vLLM and SGLang, with recommended `temperature` and `top_p`/`min_p` settings for different tasks.
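Since the model is validated for serving with vLLM and SGLang, both of which expose an OpenAI-compatible endpoint, a minimal sketch of assembling a chat request might look like the following. The sampling values here are illustrative placeholders, not the card's official recommendations, and the helper function name is our own:

```python
import json

# Model identifier from this card; the sampling defaults below are
# illustrative assumptions, not official recommendations.
MODEL_ID = "shisa-ai/shisa-v2-mistral-small-24b"

def build_chat_request(user_message: str,
                       temperature: float = 0.7,
                       top_p: float = 0.9) -> dict:
    """Assemble an OpenAI-compatible /v1/chat/completions payload,
    suitable for POSTing to a vLLM or SGLang server hosting the model."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
        "top_p": top_p,
    }

# Example: a Japanese prompt, serialized for sending to the server.
payload = build_chat_request("日本語で自己紹介してください。")
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Because the server side handles the chat template inherited from the base model, the client only needs to supply plain `messages`; per-task tuning is then a matter of adjusting the sampling parameters in the payload.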
Training & Datasets
Shisa V2 models were trained on a supervised fine-tuning (SFT) dataset of approximately 360K samples, including a filtered and regenerated version of shisa-ai/shisa-v2-sharegpt, translated prompts, and custom role-playing and instruction-following data. The DPO mix, though smaller, includes English-only preference data that surprisingly outperformed larger JA/EN sets, alongside dedicated DPO sets for role-playing, translation, instruction following, and politeness control.