SubSir/Meta-Llama-3-8B
Meta's Llama 3 8B is an 8 billion parameter instruction-tuned generative text model, part of the Llama 3 family, utilizing an optimized transformer architecture with Grouped-Query Attention (GQA) and a context length of 8192 tokens. Optimized for dialogue use cases, it is designed for commercial and research applications in English, outperforming many open-source chat models on common benchmarks. The model was trained on over 15 trillion tokens of publicly available online data, with its pretraining data cutoff in March 2023.
Loading preview...
Overview
Meta Llama 3 8B is an 8 billion parameter instruction-tuned large language model developed by Meta. It is built on an optimized transformer architecture, incorporating Grouped-Query Attention (GQA) for enhanced inference scalability. The model is designed for dialogue use cases and has been fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. It supports a context length of 8192 tokens and was trained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of March 2023.
Key Capabilities
- Instruction Following: Optimized for assistant-like chat and dialogue applications.
- Performance: Outperforms many open-source chat models on industry benchmarks, showing significant improvements over Llama 2 7B across various tasks like MMLU, AGIEval, and HumanEval.
- Safety: Developed with extensive red teaming, adversarial evaluations, and safety mitigations, including reduced false refusal rates compared to Llama 2.
- Code Generation: Achieves a HumanEval score of 62.2, indicating strong code generation capabilities.
Good for
- Commercial and Research Use: Intended for a wide range of applications in English.
- Assistant-like Chatbots: Excels in conversational AI scenarios due to its instruction-tuned nature.
- Natural Language Generation: Adaptable for various text generation tasks.
- Developers: Provides resources and guidance for responsible AI development, including integration with tools like Meta Llama Guard 2 and Code Shield.