Model Overview

Meta's Llama 3 8B is an 8 billion parameter instruction-tuned large language model, part of the Llama 3 family, developed by Meta. It utilizes an optimized transformer architecture and has been fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The model was pretrained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of March 2023, and features an 8k token context length.

Key Capabilities

Dialogue Optimization: Specifically instruction-tuned for assistant-like chat and dialogue use cases.
Performance: Outperforms many other open-source chat models on common industry benchmarks, demonstrating significant improvements over Llama 2 models across various categories like MMLU, HumanEval, and GSM-8K.
Safety & Responsibility: Developed with a strong focus on optimizing helpfulness and safety, incorporating extensive red teaming, adversarial evaluations, and safety mitigation techniques. It also features reduced false refusal rates compared to Llama 2.
Code Generation: Achieves a HumanEval score of 62.2, indicating strong code generation capabilities.

Good For

Commercial and Research Applications: Intended for a wide range of commercial and research uses in English.
Assistant-like Chatbots: Ideal for building conversational AI agents and dialogue systems.
Natural Language Generation: Adaptable for various natural language generation tasks, especially in its pretrained variant.
Developers Seeking Enhanced Safety: Incorporates robust safety features and is designed to be used with additional safeguards like Meta Llama Guard 2 and Code Shield for responsible deployment.