abacusai/Llama-3-Smaug-8B

Warm
Public
8B
FP8
8192
License: llama2
Hugging Face
Overview

Llama-3-Smaug-8B: Enhanced Conversational AI

Llama-3-Smaug-8B is an 8 billion parameter instruction-tuned model developed by Abacus.AI, fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct. This model leverages the "Smaug recipe" to significantly improve its performance in multi-turn conversational scenarios.

Key Enhancements & Performance

The Smaug recipe, which involves new techniques and data compared to previous Smaug models, focuses on optimizing the model for more effective and coherent multi-turn interactions. Evaluation on the MT-Bench benchmark highlights its strengths:

  • Improved First-Turn Performance: Llama-3-Smaug-8B achieves a score of 8.78 on the first turn of MT-Bench, outperforming the base Llama-3-8B-Instruct (8.31).
  • Consistent Multi-Turn Capability: It maintains strong performance in subsequent turns, matching the base model's score of 7.89 on the second turn.
  • Higher Overall Average: The model boasts an average MT-Bench score of 8.33, surpassing the 8.10 of its Llama-3-8B-Instruct counterpart.

Use Cases

This model is particularly well-suited for applications requiring robust and nuanced multi-turn conversational abilities, such as advanced chatbots, virtual assistants, and interactive AI systems where initial response quality and sustained coherence are critical. Further details on the Smaug methodology can be found in the previous Smaug paper.