abacusai/Smaug-Llama-3-70B-Instruct

70B parameters · FP8 · 8192-token context · License: llama3

Smaug-Llama-3-70B-Instruct Overview

Smaug-Llama-3-70B-Instruct is a 70 billion parameter instruction-tuned model developed by Abacus.AI, built upon Meta's Llama-3-70B-Instruct. It incorporates a new "Smaug recipe" specifically designed to enhance performance in real-world multi-turn conversations.

Key Capabilities & Performance

  • Superior Conversational AI: The model demonstrates substantial improvements over the base Llama-3-70B-Instruct, particularly in multi-turn dialogue scenarios.
  • Competitive Benchmarking: On MT-Bench, Smaug-Llama-3-70B-Instruct achieves an average score of 9.21, placing it on par with GPT-4-Turbo (9.19) and significantly ahead of its base model (9.01).
  • Arena-Hard Leaderboard: It is currently the top open-source model on Arena-Hard, scoring 56.7, nearly matching Claude-3-Opus-20240229 (60.4) and substantially outperforming Llama-3-70B-Instruct (41.1).
  • Instruction Following: The model retains the Llama 3 70B Instruct prompt format, so it drops in seamlessly for users already working with the base model (see the usage sketch below).
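
Because the prompt format is unchanged from Llama 3 Instruct, standard tooling works as-is. Below is a minimal usage sketch with the Hugging Face transformers pipeline; the system message, sampling parameters, and dtype/device settings are illustrative assumptions, not recommendations from the model authors.

```python
# Minimal usage sketch for the Llama 3 Instruct prompt format via transformers.
# Generation settings below are illustrative, not prescribed by the model card.
import torch
import transformers

model_id = "abacusai/Smaug-Llama-3-70B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the Smaug recipe in one sentence."},
]

# apply_chat_template renders the standard Llama 3 Instruct format
# (<|begin_of_text|>, <|start_header_id|>...<|end_header_id|>, <|eot_id|>).
prompt = pipeline.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Stop on either the EOS token or the Llama 3 end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# The pipeline returns prompt + completion; strip the prompt to get the reply.
print(outputs[0]["generated_text"][len(prompt):])
```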

Unique Aspects

This model uses new techniques and new data compared to previous Smaug iterations, with further details to be released. The underlying methodology builds on the Smaug paper: https://arxiv.org/abs/2402.13228.
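
For context, the linked paper introduces DPO-Positive (DPOP), a preference-optimization objective that augments DPO with a penalty for reducing the likelihood of the preferred completion. A sketch of that earlier loss is given below for orientation only; the exact objective and data behind this model's updated recipe have not been published, so this should not be read as its training procedure. Here β is the usual DPO temperature and λ weights the added penalty:

$$
\mathcal{L}_{\mathrm{DPOP}}(\pi_\theta;\pi_{\mathrm{ref}})
= -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[
 \log\sigma\!\left(\beta\left(
   \log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)}
   - \log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}
   - \lambda\cdot\max\!\left(0,\ \log\frac{\pi_{\mathrm{ref}}(y_w\mid x)}{\pi_\theta(y_w\mid x)}\right)
 \right)\right)
\right]
$$

where $(x, y_w, y_l)$ is a prompt with its preferred and dispreferred completions and $\pi_{\mathrm{ref}}$ is the frozen reference policy.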

Considerations

While the model excels on conversational benchmarks, its performance on traditional academic benchmarks such as ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K is largely consistent with, or slightly below, that of the base Llama-3-70B-Instruct. Notably, GSM8K scores were re-evaluated with an updated LM Evaluation Harness to correct a known answer-parsing issue affecting models that use colons in their responses.
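
As an illustration only, a GSM8K re-evaluation with a recent release of EleutherAI's lm-evaluation-harness might look like the sketch below; the harness version, dtype, batch size, and few-shot count are assumptions and are not stated by the model card as the settings behind the reported numbers.

```python
# Hedged sketch: re-running GSM8K with lm-evaluation-harness (v0.4+ API).
# All arguments below are illustrative assumptions, not the card's official setup.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=abacusai/Smaug-Llama-3-70B-Instruct,dtype=bfloat16",
    tasks=["gsm8k"],
    num_fewshot=5,
    batch_size=8,
)

# Per-task metrics (e.g., exact-match accuracy) live under results["results"].
print(results["results"]["gsm8k"])
```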