lightblue/suzume-llama-3-8B-japanese

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Apr 22, 2024 · License: llama-3 · Architecture: Transformer

The lightblue/suzume-llama-3-8B-japanese model is an 8 billion parameter Llama 3 variant, fine-tuned by Lightblue for enhanced Japanese language capabilities. It leverages the strong English performance of Llama 3 and specializes in Japanese conversation, having been trained on over 3,000 Japanese dialogues. This model is optimized for Japanese natural language processing tasks and achieves leading performance among 7/8B LLMs on various Japanese benchmarks.


Suzume-Llama-3-8B-Japanese Overview

lightblue/suzume-llama-3-8B-japanese is an 8 billion parameter language model developed by Lightblue, specifically fine-tuned for Japanese language understanding and generation. It builds upon the robust Llama 3 architecture, addressing its English-centric bias by integrating extensive Japanese conversational data.

Key Capabilities & Differentiators

  • Japanese Language Proficiency: Unlike the base Llama 3, which often defaults to English, this model is specifically trained to respond and converse effectively in Japanese.
  • Superior Japanese Benchmarks: Evaluations indicate that Suzume-Llama-3-8B-Japanese is the top-performing model in the 7/8B parameter class across multiple Japanese language benchmarks.
  • Extensive Japanese Training Data: Fine-tuned on over 3,000 Japanese conversations, including hand-edited datasets, Japanese conversations generated with GPT-4, and GPT-4 responses to diverse prompts drawn from lmsys-chat-1m.

Use Cases & Performance

This model is ideal for applications requiring high-quality Japanese text generation and understanding. It is particularly well-suited for chatbots, content creation, and any scenario where accurate and natural Japanese interaction is critical. While a related multilingual model shows slightly higher scores on the Japanese MT-Bench, this dedicated Japanese model demonstrates strong performance, especially when evaluated with Japanese system messages.
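As a sketch of how the model might be queried for Japanese chat through the Hugging Face transformers library (the model ID comes from this card; the system prompt, sample question, and generation settings are illustrative assumptions, not settings recommended by Lightblue):

```python
# Sketch: Japanese chat with lightblue/suzume-llama-3-8B-japanese via transformers.
# Only the model ID is taken from this card; everything else is illustrative.

def build_chat(user_message, system_message="あなたは親切な日本語アシスタントです。"):
    """Build a chat message list in the role/content format that
    tokenizer.apply_chat_template expects. The system prompt
    ("You are a helpful Japanese assistant.") is just an example."""
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
    ]

if __name__ == "__main__":
    # Downloads ~8B parameters of weights; run only with the model available.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "lightblue/suzume-llama-3-8B-japanese"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages = build_chat("日本の首都はどこですか？")  # "What is the capital of Japan?"
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(
        inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.9
    )
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Prompting with a Japanese system message, as above, matches the evaluation setup under which this model performs best.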

Popular Sampler Settings

The sampler configurations most often used by Featherless users for this model adjust the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
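These sampler parameters map directly onto the body of an OpenAI-compatible chat completion request. A minimal sketch of assembling such a request; the default values below are illustrative assumptions, not a recorded user configuration, and `repetition_penalty`/`min_p` are extensions accepted by many OpenAI-compatible servers rather than core OpenAI API fields:

```python
# Sketch: building a request body with the sampler parameters listed above.
# All default values are illustrative, not taken from this card.

DEFAULT_SAMPLERS = {
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,  # extension on many OpenAI-compatible servers
    "min_p": 0.05,              # likewise an extension, not a core OpenAI field
}

def build_request(prompt, model="lightblue/suzume-llama-3-8B-japanese", **overrides):
    """Merge the sampler defaults with per-call overrides into a request body
    suitable for posting to an OpenAI-compatible /chat/completions endpoint."""
    samplers = {**DEFAULT_SAMPLERS, **overrides}
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        **samplers,
    }

if __name__ == "__main__":
    body = build_request("自己紹介してください。", temperature=0.8)
    print(body["temperature"])  # per-call override wins over the default
```

Per-call keyword overrides replace the defaults key by key, so a caller can raise `temperature` for creative output or lower it for deterministic Japanese Q&A without restating the whole configuration.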