tokyotech-llm/Swallow-70b-hf

Text Generation · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Ctx Length: 8k · Published: Nov 25, 2023 · License: llama2 · Architecture: Transformer · Open Weights

Swallow-70b-hf is a 70 billion parameter causal language model developed by TokyoTech-LLM, continually pre-trained from the Llama 2 family with a significant addition of Japanese language data. This model utilizes a tokenizer with a broadened vocabulary for Japanese, enabling more efficient text representation and faster inference. It excels in Japanese language tasks, demonstrating strong performance across various benchmarks including question answering, summarization, and mathematical reasoning, while maintaining competitive English capabilities.


Swallow-70b-hf: A Llama 2-based Model Optimized for Japanese

Swallow-70b-hf is a 70 billion parameter language model developed by TokyoTech-LLM, built upon the Llama 2 architecture. Its core distinction lies in its continual pre-training on extensive Japanese language data, which significantly improves its performance on Japanese tasks compared to the original Llama 2 models. The model employs a tokenizer with a vocabulary extended for Japanese, which leads to more efficient text representation and faster inference.
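
As a rough illustration, the snippet below loads the model with Hugging Face transformers and generates a continuation for a Japanese prompt. It is a minimal sketch, assuming a host with enough GPU memory to hold a 70B model in bf16 across devices; the prompt and sampling parameters are illustrative, not prescribed by the model card.

```python
# Minimal sketch: load Swallow-70b-hf and generate a Japanese continuation.
# Assumes sufficient multi-GPU memory for 70B bf16 weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tokyotech-llm/Swallow-70b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory
    device_map="auto",           # shard layers across available GPUs
)

# Swallow-70b-hf is a base (non-instruct) model, so prompt it with text
# to continue rather than chat-style instructions.
prompt = "東京工業大学の主なキャンパスは、"  # "Tokyo Tech's main campuses are..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.9,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because this is a base model rather than an instruction-tuned variant, completion-style prompts like the one above generally work better than chat-style instructions.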

Key Capabilities

  • Enhanced Japanese Language Proficiency: Demonstrates superior performance across a range of Japanese benchmarks, including JCommonsenseQA, JEMHopQA, NIILC, JSQuAD, XL-Sum (summarization), MGSM (mathematical reasoning), and WMT20 English↔Japanese machine translation.
  • Efficient Tokenization: Utilizes a Japanese-optimized tokenizer with an extended vocabulary, so Japanese text encodes into fewer tokens and inference runs faster (see the token-count sketch after this list).
  • Strong Foundation: Benefits from the robust architecture of the Llama 2 family.
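
To make the tokenizer point concrete, the sketch below counts the tokens produced for a Japanese sentence; fewer tokens per sentence means shorter sequences and faster generation. The comparison against the base Llama 2 tokenizer is shown only as an assumption for illustration, since meta-llama/Llama-2-70b-hf is gated and requires accepting its license.

```python
# Rough sketch: inspect the tokenizer's Japanese efficiency.
from transformers import AutoTokenizer

swallow_tok = AutoTokenizer.from_pretrained("tokyotech-llm/Swallow-70b-hf")

text = "自然言語処理の研究はここ数年で大きく進展した。"
print("vocab size:", swallow_tok.vocab_size)
print("Swallow tokens:", len(swallow_tok.tokenize(text)))

# For comparison against the base Llama 2 tokenizer (gated; requires access):
# llama_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-70b-hf")
# print("Llama 2 tokens:", len(llama_tok.tokenize(text)))
```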

When to Use This Model

  • Japanese NLP Applications: Ideal for tasks requiring high accuracy and efficiency in Japanese language understanding and generation.
  • Cross-Lingual Research: Suitable for projects involving both Japanese and English, where strong Japanese performance is critical.
  • Resource-Efficient Inference: The Japanese-optimized tokenizer encodes Japanese text in fewer tokens, shortening sequences and lowering inference cost; a quantized-loading sketch follows below.
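
For memory-constrained deployments, one option (an assumption on our part, not something the model card prescribes) is 4-bit quantized loading via bitsandbytes, sketched below. Verify the quality trade-off on your own Japanese workloads before relying on it.

```python
# Hedged sketch: load the model with 4-bit quantization to lower the
# GPU memory footprint. Exact memory needs and quality impact should
# be measured for your workload.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "tokyotech-llm/Swallow-70b-hf"

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
    bnb_4bit_quant_type="nf4",              # normal-float 4-bit weights
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```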