tokyotech-llm/Swallow-7b-NVE-hf

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Nov 30, 2023 · License: llama2 · Architecture: Transformer · Open weights

Swallow-7b-NVE-hf by TokyoTech-LLM is a 7-billion-parameter language model continually pre-trained from Llama 2 with a substantial amount of additional Japanese-language data. The 'NVE' (No Vocabulary Expansion) variant improves Japanese capability without altering the original Llama 2 tokenizer's vocabulary. It performs strongly on a range of Japanese NLP tasks, often outperforming its Llama 2 base, while remaining competitive on English tasks.

Overview

TokyoTech-LLM's Swallow-7b-NVE-hf is a 7-billion-parameter language model built on the Llama 2 architecture and continually pre-trained with a large addition of Japanese-language data. Unlike other Swallow variants, this 'NVE' (No Vocabulary Expansion) model retains the original Llama 2 tokenizer, gaining its improvements from training data rather than vocabulary modifications.
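The model loads through the standard Hugging Face `transformers` causal-LM interface. The sketch below is a minimal example, not an official recipe: the Japanese prompt and the sampling settings (`max_new_tokens`, `temperature`, `top_p`) are illustrative choices, and loading in `bfloat16` with `device_map="auto"` assumes a GPU with roughly 16 GB of free memory for the 7B weights.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tokyotech-llm/Swallow-7b-NVE-hf"

# NVE keeps the stock Llama 2 tokenizer, so no custom tokenizer code is needed.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # assumption: a GPU with ~16 GB of free memory
    device_map="auto",
)

# This is a base (non-instruct) model, so use completion-style prompting.
prompt = "東京工業大学の主なキャンパスは、"  # illustrative Japanese prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,  # illustrative sampling settings
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```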

Key Capabilities

  • Enhanced Japanese Performance: Demonstrates notable improvements over the base Llama 2 model across various Japanese benchmarks, including JCommonsenseQA, JEMHopQA, NIILC, and JSQuAD.
  • Bilingual Proficiency: While optimized for Japanese, it maintains competitive performance on English NLP tasks such as OpenBookQA, TriviaQA, and HellaSwag.
  • Continual Pre-training: Benefits from additional training on diverse datasets including Japanese Wikipedia, RefinedWeb, Swallow Corpus, and The Pile.

Good for

  • Applications requiring strong Japanese language understanding and generation.
  • Researchers and developers looking for a Llama 2-based model with specialized Japanese capabilities.
  • Use cases where keeping the original Llama 2 tokenizer is preferred (a quick tokenizer check illustrating this follows below).
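Because 'NVE' means no vocabulary expansion, the published checkpoint should ship the unmodified 32,000-token Llama 2 SentencePiece tokenizer. The snippet below is a small sanity check of that assumption rather than anything from the official model card.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tokyotech-llm/Swallow-7b-NVE-hf")

# No Vocabulary Expansion: the vocabulary should match Llama 2's original
# 32,000 SentencePiece tokens (the vocabulary-expanded Swallow variants differ).
assert tokenizer.vocab_size == 32000, tokenizer.vocab_size

# Without an expanded Japanese vocabulary, Japanese text splits into more
# (often sub-character, byte-level) pieces than under an expanded tokenizer.
print(tokenizer.tokenize("東京工業大学"))
```

The trade-off is longer token sequences for Japanese text; in exchange, the model stays drop-in compatible with tooling built around the stock Llama 2 tokenizer.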