trinhvanhung/Meta-Llama-3.1-8B-Instruct-Q4_K_M

Cold
Public
8B
FP8
32768
1
Dec 22, 2024
License: llama3.1
Hugging Face
Overview

This model is the 8-billion-parameter instruction-tuned variant from Meta's Llama 3.1 collection, released on July 23, 2024. It uses an optimized transformer architecture with Grouped-Query Attention (GQA), in which groups of query heads share key/value heads to reduce memory use at inference time. The model supports a 128k-token context length and was trained on more than 15 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023. Post-training combined supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align the model with human preferences for helpfulness and safety.
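To make the GQA benefit concrete, the following back-of-the-envelope sketch estimates the key/value cache size at the full 128k context. The architecture numbers (32 layers, 32 query heads, 8 key/value heads, head dimension 128) and the 2-byte value size are assumptions for illustration; they are not stated in this card, so check them against the model's configuration before relying on the result.

```python
# Rough KV-cache size with and without Grouped-Query Attention (GQA).
# All architecture constants below are ASSUMPTIONS, not taken from this card.

def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache keys and values for `tokens` positions."""
    # Factor 2: one tensor for keys and one for values in every layer.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens

LAYERS = 32           # assumed number of transformer layers
QUERY_HEADS = 32      # assumed number of query heads
KV_HEADS = 8          # assumed number of shared key/value heads under GQA
HEAD_DIM = 128        # assumed per-head dimension
CONTEXT = 128 * 1024  # 128k tokens

# With GQA, only KV_HEADS key/value heads are cached.
gqa = kv_cache_bytes(CONTEXT, LAYERS, KV_HEADS, HEAD_DIM)
# Without GQA (standard multi-head attention), every query head has its own K/V.
mha = kv_cache_bytes(CONTEXT, LAYERS, QUERY_HEADS, HEAD_DIM)

GIB = 1024 ** 3
print(f"GQA cache: {gqa / GIB:.1f} GiB")
print(f"MHA cache: {mha / GIB:.1f} GiB")
print(f"Reduction: {mha // gqa}x")
```

Under these assumptions the cache shrinks by the ratio of query heads to key/value heads, which is why GQA matters most at long context lengths.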

Key Capabilities

  • Multilingual Dialogue: Optimized for assistant-like chat in multiple languages, including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
  • Extended Context: Supports a 128k-token context window for processing longer inputs and producing longer, more coherent responses.
  • Advanced Tool Use: Features robust support for tool use and function calling, with detailed prompt formatting guidelines and integration with Hugging Face Transformers chat templates.
  • Strong Benchmarks: Improves on Llama 3 8B Instruct across a range of benchmarks, most notably MMLU (CoT), HumanEval (pass@1), GSM-8K (CoT), MATH (CoT), and API-Bank for tool use.
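As a rough illustration of the tool-use capability listed above, the sketch below shows how an application might route a JSON function call emitted by the model to a local Python function. The `get_weather` tool, the `TOOLS` registry, and `dispatch_tool_call` are hypothetical names introduced here, and the `{"name": ..., "parameters": {...}}` call shape is an assumption about the output format; the example string is hand-written, not real model output.

```python
import json

# Hypothetical tool for illustration; not part of the model or any library.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(model_output: str) -> str:
    """Parse a JSON tool call and run the matching Python function."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    # Assumed call shape: {"name": ..., "parameters": {...}}
    return TOOLS[name](**call.get("parameters", {}))

# Hand-written stand-in for a model response that requests a tool call.
example = '{"name": "get_weather", "parameters": {"city": "Paris"}}'
print(dispatch_tool_call(example))
```

In a real application the tool result would be appended to the conversation as a new message so the model can compose its final answer. With Hugging Face Transformers, tool definitions are typically passed through the `tools` argument of the tokenizer's `apply_chat_template` method, which renders them into the prompt.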

Good For

  • Developing multilingual chat assistants and conversational AI applications.
  • Tasks requiring long-context understanding and generation.
  • Applications leveraging tool use and function calling for enhanced capabilities.
  • Research and commercial use in natural language generation across supported languages.