bofenghuang/Meta-Llama-3-8B

TEXT GENERATION · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 8K · Published: Apr 23, 2024 · License: llama3 · Architecture: Transformer

Meta-Llama-3-8B is an 8 billion parameter instruction-tuned generative text model developed by Meta, part of the Llama 3 family. It uses an optimized transformer architecture with Grouped-Query Attention (GQA) and is fine-tuned with supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) for dialogue use cases. Trained on over 15 trillion tokens with an 8K context length, it significantly outperforms Llama 2 models on common benchmarks such as MMLU (68.4) and HumanEval (62.2), making it well suited for assistant-like chat and general natural language generation tasks in English.


Meta-Llama-3-8B: An Advanced 8B Parameter LLM from Meta

Meta-Llama-3-8B is an 8 billion parameter instruction-tuned large language model developed by Meta, designed for generative text and code. As part of the Llama 3 family, it leverages an optimized transformer architecture and incorporates Grouped-Query Attention (GQA) for enhanced inference scalability. The instruction-tuned variant is specifically optimized for dialogue use cases, demonstrating superior performance compared to previous Llama 2 models on various industry benchmarks.
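The GQA mechanism mentioned above lets several query heads share one key/value head, shrinking the KV cache at inference time. Below is a minimal NumPy sketch of the idea; the head counts are illustrative (Llama-3-8B itself pairs 32 query heads with 8 KV heads), and the function name is ours, not an API from any library.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d) with n_kv_heads | n_q_heads."""
    group = q.shape[0] // k.shape[0]
    # Each KV head is shared by `group` consecutive query heads.
    k = np.repeat(k, group, axis=0)          # -> (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)   # (n_q_heads, seq, seq)
    # Numerically stable softmax over the key dimension.
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v                              # (n_q_heads, seq, d)

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))   # 8 query heads
k = rng.standard_normal((2, 4, 16))   # only 2 KV heads to store
v = rng.standard_normal((2, 4, 16))
out = grouped_query_attention(q, k, v)
```

With 8 query heads over 2 KV heads, the KV cache is a quarter the size of standard multi-head attention while the output keeps the full per-query-head shape.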

Key Capabilities

  • Optimized for Dialogue: Instruction-tuned for assistant-like chat applications, aligning with human preferences for helpfulness and safety through supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF).
  • Strong Benchmark Performance: Achieves 68.4 on MMLU (5-shot), 62.2 on HumanEval (0-shot), and 79.6 on GSM-8K (8-shot, CoT), significantly surpassing Llama 2 7B and 13B models.
  • Extensive Training Data: Pretrained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of March 2023, and fine-tuned with over 10 million human-annotated examples.
  • 8K Context Length: Supports an 8,192-token context window, enabling processing of longer inputs and generating more coherent responses.
  • Responsible AI Focus: Developed with a strong emphasis on safety, including extensive red teaming, adversarial evaluations, and mitigations that reduce both residual risks and false refusals, so it is more helpful than Llama 2 without being less safe.
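Because the model is instruction-tuned for dialogue, prompts served to it raw (without a chat-template helper) should follow Meta's published Llama 3 instruct format. The helper below is a sketch of that format; the function name is ours.

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3 instruct prompt from its special tokens."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("You are a helpful assistant.", "Hi!")
```

Generation should be stopped on the `<|eot_id|>` token, which the model emits at the end of each turn.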

Good for

  • Building highly performant English-language chatbots and virtual assistants.
  • General natural language generation tasks requiring high accuracy and coherence.
  • Applications benefiting from strong reasoning and code generation capabilities, as indicated by its HumanEval and GSM-8K scores.
  • Developers seeking a powerful, openly available model with robust safety considerations for commercial and research use.
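For chatbot and assistant use cases like those above, hosted models of this kind are typically queried through an OpenAI-compatible chat completions endpoint. The request body below is a hypothetical sketch assuming that common schema; it is not taken from this page.

```python
import json

# Hypothetical chat-completions request body (OpenAI-compatible schema).
payload = {
    "model": "bofenghuang/Meta-Llama-3-8B",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize Grouped-Query Attention in one sentence."},
    ],
    "max_tokens": 256,
}
body = json.dumps(payload)  # serialized for the POST request
```

The `messages` list maps directly onto the system/user/assistant turns of the Llama 3 instruct format, so no manual prompt assembly is needed when the server applies the chat template.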

Popular Sampler Settings

The most popular parameter combinations used by Featherless users for this model cover the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
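A sampler configuration touching all of the parameters above could be passed as a flat mapping alongside a generation request. The numeric values below are illustrative placeholders, not the actual popular settings from this page (which are not shown in the text).

```python
# Illustrative sampler settings; values are placeholders, not site data.
sampler = {
    "temperature": 0.7,         # randomness of token choice
    "top_p": 0.9,               # nucleus sampling cutoff
    "top_k": 40,                # keep only the k most likely tokens
    "frequency_penalty": 0.0,   # penalize tokens by how often they appeared
    "presence_penalty": 0.0,    # penalize tokens that appeared at all
    "repetition_penalty": 1.1,  # multiplicative repeat discouragement
    "min_p": 0.05,              # drop tokens below this fraction of the top prob
}
```

In an OpenAI-compatible request these keys would be merged into the request body next to `model` and `messages`.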