Name: tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: tokyotech-llm

Llama-3.1-Swallow-8B-Instruct-v0.3 Overview

This model is an 8 billion parameter instruction-tuned variant from the Llama 3.1 Swallow series, developed by tokyotech-llm. It is built upon Meta's Llama 3.1 base models through continual pre-training with a focus on enhancing Japanese language capabilities while maintaining strong English performance. The pre-training involved approximately 200 billion tokens from diverse sources, including a large Japanese web corpus (Swallow Corpus Version 2), Japanese and English Wikipedia, and mathematical/coding content.

Key Capabilities & Features

Bilingual Proficiency: Significantly improved Japanese language understanding and generation, alongside robust English capabilities.
Instruction-Tuned: Optimized for following user instructions and engaging in multi-turn conversations, achieved through supervised fine-tuning on specially built synthetic Japanese data.
State-of-the-Art Japanese MT-Bench Performance: This v0.3 release demonstrates leading performance on Japanese MT-Bench among open-source LLMs with 8 billion parameters or less, showing an 8.4-point improvement over its predecessor.
Llama 3.1 Architecture: Leverages the architectural strengths of the Meta Llama 3.1 series.

Ideal Use Cases

Japanese-centric Applications: Excellent for chatbots, content generation, and conversational AI systems requiring high proficiency in Japanese.
Bilingual AI Assistants: Suitable for applications that need to seamlessly handle both Japanese and English interactions.
Research and Development: A strong foundation for further fine-tuning or research into cross-lingual LLM adaptation, particularly for Japanese language tasks.

Overview

Llama-3.1-Swallow-8B-Instruct-v0.3 Overview

Key Capabilities & Features

Ideal Use Cases

Full Model Card (README)