Llama-3.1-Swallow-8B-Instruct-v0.2 Overview
This model is an 8-billion-parameter instruction-tuned variant from the Llama 3.1 Swallow series, developed by tokyotech-llm. It is built by continual pre-training of the original Meta Llama 3.1 model, enhancing Japanese language proficiency while preserving English capabilities. Pre-training used approximately 200 billion tokens drawn from a large Japanese web corpus (Swallow Corpus Version 2), Japanese and English Wikipedia, and mathematical and coding content.
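A minimal loading sketch with Hugging Face Transformers is shown below. It assumes the checkpoint is published under the repository id `tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.2`; adjust the id, dtype, and device placement for your environment.

```python
# Minimal loading sketch (assumed repo id; not an official quickstart).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.2"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 8B weights fit on a single ~24 GB GPU in bf16
    device_map="auto",           # let accelerate place layers across available devices
)
```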
Key Capabilities & Features
- Enhanced Japanese Performance: Achieves a Japanese average score of 0.5141 across various benchmarks, outperforming other Llama 3 and Qwen2 models in its size class.
- Strong English Performance: Maintains competitive English task performance with an average score of 0.5823.
- Instruction-Tuned: Fine-tuned on synthetic Japanese and English datasets, including `Llama-3.1-LMSYS-Chat-1M-Synth-Ja`, `Swallow-Magpie-Ultra-v0.1`, and `filtered-magpie-ultra-en`.
- Multi-turn Dialogue: Scores 0.5584 on Japanese MT-Bench, demonstrating solid multi-turn conversation performance (see the sketch after this list).
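The following sketch illustrates multi-turn Japanese dialogue using the tokenizer's built-in chat template (Llama 3.1 instruct format). It assumes `model` and `tokenizer` were loaded as in the earlier snippet; the example messages and sampling parameters are illustrative, not recommended settings.

```python
# Multi-turn dialogue sketch: two prior turns plus a follow-up question.
messages = [
    {"role": "user", "content": "日本の首都はどこですか？"},            # "What is the capital of Japan?"
    {"role": "assistant", "content": "日本の首都は東京です。"},         # "The capital of Japan is Tokyo."
    {"role": "user", "content": "その都市の人口はおよそ何人ですか？"},  # follow-up: "Roughly what is its population?"
]

# Render the conversation with the model's chat template and append the
# assistant header so generation continues as the assistant's next turn.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```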
Good For
- Applications requiring robust performance in both Japanese and English.
- Tasks involving question answering, summarization, and code generation in Japanese contexts.
- Multi-turn conversational AI systems targeting Japanese users.