tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1

Parameters: 8B · Precision: FP8 · Context length: 32,768 tokens
License: llama3.1
Overview

Llama 3.1 Swallow 8B Instruct v0.1: Enhanced Japanese Capabilities

This model is an 8-billion-parameter instruction-tuned variant from the Llama 3.1 Swallow series, developed by tokyotech-llm. It was built by continual pre-training of the Meta Llama 3.1 base model, with a focus on significantly enhancing Japanese language capabilities while retaining strong English performance.

Key Capabilities

  • Bilingual Proficiency: Excels in both Japanese and English, with a particular focus on Japanese language tasks.
  • Continual Pre-training: Utilizes approximately 200 billion tokens from a large Japanese web corpus (Swallow Corpus Version 2), Japanese and English Wikipedia, and mathematical/coding content.
  • Instruction-Tuned: Supervised fine-tuning (SFT) was performed using synthetic data specifically designed for Japanese.
  • Strong Japanese Benchmarks: Achieves leading scores on Japanese evaluation benchmarks including JCommonsenseQA, JEMHopQA, NIILC, and JSQuAD, and competitive performance on MT-Bench JA.
  • Llama 3.1 Foundation: Benefits from the robust architecture and tokenizer of the Meta Llama 3.1 models.

Good for

  • Applications requiring high-quality Japanese language understanding and generation.
  • Bilingual (Japanese-English) conversational AI and instruction-following tasks.
  • Research and development in cross-lingual LLM adaptation, particularly for Japanese.
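As a minimal usage sketch, the model can be driven through the standard Hugging Face `transformers` chat-template flow used by Llama 3.1-style instruct models. The system/user messages and the generation parameters (`max_new_tokens`, `temperature`, `top_p`) below are illustrative assumptions, not official recommendations from the model authors:

```python
MODEL_ID = "tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1"

# Chat turns follow the standard role/content schema consumed by
# Llama 3.1 chat templates. Example prompt: a Japanese system prompt
# ("You are a sincere and excellent Japanese assistant.") and a user
# question ("Where is Tokyo Institute of Technology located?").
messages = [
    {"role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。"},
    {"role": "user", "content": "東京工業大学はどこにありますか？"},
]


def generate(chat):
    # Heavy dependencies are imported lazily so that merely importing
    # this module does not require transformers/torch to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Loading an 8B model needs a GPU with sufficient memory; the
    # dtype/device settings here are illustrative.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )

    # apply_chat_template renders the messages into the model's
    # prompt format and appends the assistant header.
    input_ids = tokenizer.apply_chat_template(
        chat, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(
        input_ids, max_new_tokens=256, do_sample=True,
        temperature=0.6, top_p=0.9,
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(
        output[0][input_ids.shape[-1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    print(generate(messages))
```

Since the SFT data was designed for Japanese, prompting in Japanese (as above) plays to the model's strengths, though English prompts work as well.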