Overview
v2ray/Llama-3-70B-Instruct is a 70-billion-parameter instruction-tuned large language model from Meta's Llama 3 family, optimized for dialogue and assistant-like chat applications. It is built on an optimized transformer architecture and incorporates Grouped-Query Attention (GQA) for improved inference scalability. The model was pretrained on over 15 trillion tokens of publicly available data, fine-tuned on data that includes over 10 million human-annotated examples, and has a knowledge cutoff of December 2023.
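The idea behind GQA is that several query heads share a single key/value head, shrinking the KV cache and speeding up inference. The following is a minimal illustrative sketch of that sharing pattern (not the model's actual implementation); the head counts, function name, and single-token framing are assumptions for demonstration:

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """Illustrative GQA for one query position.

    q: (n_q_heads, d) query vectors, one per query head.
    k, v: (n_kv_heads, seq, d) shared key/value heads.
    Each group of n_q_heads // n_kv_heads query heads reuses
    the same key/value head, which is what shrinks the KV cache.
    """
    n_q_heads, d = q.shape
    group = n_q_heads // n_kv_heads
    # Broadcast each KV head to the query heads in its group.
    k = np.repeat(k, group, axis=0)  # (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    # Scaled dot-product attention per head.
    scores = np.einsum("hd,hsd->hs", q, k) / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return np.einsum("hs,hsd->hd", weights, v)  # (n_q_heads, d)
```

With standard multi-head attention, `n_kv_heads` would equal `n_q_heads`; GQA sits between that and multi-query attention (a single KV head), trading a small quality cost for a much smaller KV cache at 70B scale.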
Key Capabilities
- Strong Benchmark Performance: Outperforms many open-source chat models on common industry benchmarks, including MMLU (82.0), GPQA (39.5), HumanEval (81.7), GSM-8K (93.0), and MATH (50.4).
- Optimized for Dialogue: Instruction-tuned using Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) to align with human preferences for helpfulness and safety in conversational contexts.
- Reduced Refusals: Significantly less likely to falsely refuse to answer benign prompts compared to Llama 2, improving user experience.
- Code Generation: Demonstrates strong coding ability, as reflected in the HumanEval score above.
Intended Use Cases
- Assistant-like Chat: Ideal for building conversational AI agents and chatbots.
- Commercial and Research Use: Suitable for a wide range of commercial and research applications requiring natural language generation in English.
- English Language Tasks: Primarily intended for use in English, though fine-tuning for other languages is permitted under the Llama 3 Community License and Acceptable Use Policy.
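For assistant-like chat, inputs are expected in the Llama 3 instruct prompt format, which in practice is produced by the tokenizer's `apply_chat_template` method in Hugging Face transformers. As a sketch of what that template assembles (the helper function name is an assumption for illustration):

```python
def build_llama3_prompt(messages):
    """Assemble the Llama 3 instruct chat format from role/content dicts.

    Normally tokenizer.apply_chat_template handles this; the structure
    is shown explicitly here: each turn is wrapped in header tokens and
    terminated with <|eot_id|>, and the prompt ends with an empty
    assistant header so the model generates the reply.
    """
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is GQA?"},
])
```

Generation should stop on either `<|eot_id|>` or `<|end_of_text|>`, both of which the instruct model uses as end-of-turn markers.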
Responsible AI
Meta emphasizes responsible AI development, providing resources like the Responsible Use Guide and tools such as Meta Llama Guard 2 and Code Shield to help developers implement safety measures. The model underwent extensive red teaming and adversarial evaluations to mitigate risks, with a focus on reducing false refusals and addressing critical risk areas such as CBRNE (chemical, biological, radiological, nuclear, and high-yield explosives), cybersecurity, and child safety.