lightblue/suzume-llama-3-8B-multilingual

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:Apr 23, 2024License:llama-3Architecture:Transformer0.1K Warm

The lightblue/suzume-llama-3-8B-multilingual model is an 8 billion parameter instruction-tuned causal language model developed by lightblue. It is a multilingual fine-tune of Meta's Llama 3 8B Instruct, specifically enhanced for conversational abilities across multiple languages. This model addresses Llama 3's English-centric responses by integrating nearly 90,000 multilingual conversations, making it suitable for applications requiring robust non-English language interaction while maintaining strong English performance.

Loading preview...

Suzume-Llama-3-8B-Multilingual: Enhanced Multilingual Capabilities

lightblue/suzume-llama-3-8B-multilingual is an 8 billion parameter language model, fine-tuned from Meta's Llama 3 8B Instruct. While Llama 3 demonstrates strong English performance, this Suzume variant significantly expands its multilingual conversational abilities.

Key Capabilities & Differentiators

  • Multilingual Fine-tuning: Enhanced with nearly 90,000 multilingual conversations, allowing it to respond effectively in various languages, unlike the base Llama 3 which often defaults to English.
  • Strong Multilingual Benchmarks: Achieves competitive MT-Bench scores across 6 languages (German, French, Japanese, Russian, Chinese, English), often outperforming or matching models like Nexusflow/Starling-LM-7B-beta.
  • Minimal English Degradation: Maintains strong English performance, with only minimal degradation compared to the original Llama 3 8B Instruct, while vastly improving non-English interaction.
  • Training Data: Trained on a diverse dataset including lightblue/tagengo-gpt4 (76k conversations), megagonlabs/instruction_ja (669 Japanese conversations), and openchat/openchat_sharegpt4_dataset (6k multilingual conversations).

Good For

  • Applications requiring a Llama 3-based model with robust multilingual conversational support.
  • Chatbots and interactive AI systems targeting non-English speaking users or requiring mixed-language interactions.
  • Developers seeking a powerful 8B model that balances strong English performance with broad language coverage.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p