Alignment-Lab-AI/Meta-Llama-3-8B-instruct-hf

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | Published: Apr 24, 2024 | License: llama3 | Architecture: Transformer

Meta Llama 3 8B Instruct is an 8 billion parameter, instruction-tuned generative text model developed by Meta and optimized for dialogue use cases. It uses an optimized transformer architecture with Grouped-Query Attention (GQA) and was trained on over 15 trillion tokens of publicly available online data. The model excels in assistant-like chat applications and demonstrates strong performance across benchmarks such as MMLU and HumanEval, outperforming Llama 2 models.

Meta Llama 3 8B Instruct: Overview

Meta Llama 3 8B Instruct is an 8 billion parameter, instruction-tuned large language model developed by Meta, designed for dialogue and assistant-like chat applications. It is part of the Llama 3 family, which includes both 8B and 70B parameter variants, and is built upon an optimized transformer architecture incorporating Grouped-Query Attention (GQA) for enhanced inference scalability.

Key Capabilities & Features

  • Optimized for Dialogue: Fine-tuned with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align with human preferences for helpfulness and safety in conversational settings.
  • Strong Performance: Demonstrates significant improvements over Llama 2 models across various benchmarks. For instance, the 8B Instruct model achieves 68.4 on MMLU (5-shot), 62.2 on HumanEval (0-shot), and 79.6 on GSM-8K (8-shot, CoT).
  • Extensive Training Data: Pretrained on over 15 trillion tokens of publicly available online data, with fine-tuning data including public instruction datasets and over 10 million human-annotated examples.
  • English-focused: Intended for commercial and research use in English, though fine-tuning for other languages is permitted under the custom Llama 3 commercial license.

Good For

  • Assistant-like Chatbots: Its instruction-tuned nature makes it highly suitable for building conversational AI agents.
  • Natural Language Generation: Adaptable for various text generation tasks beyond chat, especially where helpfulness and safety are priorities.
  • Research and Development: Provides a robust base for further fine-tuning and exploration in LLM applications.
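
Because the model is instruction-tuned for chat, prompts should follow the Llama 3 chat format. Below is a minimal sketch of that template as a plain string builder; the helper name is illustrative, and in practice you would let `tokenizer.apply_chat_template` from Hugging Face `transformers` render the prompt for you:

```python
# Illustrative sketch of the Llama 3 instruct prompt format.
# Assumption: message dicts use the common {"role", "content"} shape;
# real applications should prefer tokenizer.apply_chat_template.

def build_llama3_prompt(messages):
    """Render a list of {"role", "content"} dicts into a Llama 3 prompt string."""
    prompt = "<|begin_of_text|>"
    for msg in messages:
        prompt += f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
        prompt += f"{msg['content']}<|eot_id|>"
    # Open the assistant header so the model generates the reply next.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Grouped-Query Attention?"},
])
print(prompt)
```

Each turn is delimited by `<|start_header_id|>`/`<|end_header_id|>` and terminated with `<|eot_id|>`; serving stacks that expose an OpenAI-style chat endpoint typically apply this template automatically.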