NousResearch/Llama-2-70b-hf
NousResearch/Llama-2-70b-hf is a 70 billion parameter pretrained generative text model from Meta's Llama 2 family, converted for Hugging Face Transformers. This auto-regressive language model uses an optimized transformer architecture and is intended for commercial and research use in English, primarily for natural language generation tasks. The 70B variant uses Grouped-Query Attention (GQA) for improved inference scalability and performs well on academic benchmarks covering code, commonsense reasoning, and MMLU.
Llama 2 70B: A Powerful Pretrained Generative Text Model
This model is the 70 billion parameter pretrained variant of Meta's Llama 2 family, converted to the Hugging Face Transformers format. Llama 2 is a collection of pretrained and fine-tuned generative text models developed by Meta, ranging from 7B to 70B parameters. This version is the pretrained (not fine-tuned) auto-regressive language model, built on an optimized transformer architecture.
Key Characteristics & Performance
- Parameter Count: 70 billion parameters, making it suitable for complex natural language generation tasks.
- Architecture: Utilizes an optimized transformer architecture, with the 70B model specifically incorporating Grouped-Query Attention (GQA) for enhanced inference scalability.
- Training Data: Pretrained on 2 trillion tokens of publicly available online data, with a data cutoff of September 2022.
- Performance: Demonstrates strong results on various academic benchmarks, including:
  - Code: 37.5% average pass@1 across HumanEval and MBPP.
  - Commonsense Reasoning: 71.9% average across multiple benchmarks.
  - MMLU: 68.9%.
- TruthfulQA: 50.18% of generations are both truthful and informative.
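To make the "inference scalability" point about Grouped-Query Attention concrete, the sketch below estimates the key/value cache size per generated token. The layer count, head counts, and head dimension are assumptions taken from the commonly published 70B configuration and are not stated in this card; the multi-head baseline is a hypothetical comparison, not a released model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionConfig:
    num_layers: int
    num_query_heads: int
    num_kv_heads: int
    head_dim: int


def kv_cache_bytes_per_token(cfg: AttentionConfig, bytes_per_value: int = 2) -> int:
    """Bytes of K/V cache stored for one token (default: 2 bytes, i.e. fp16)."""
    # One key vector and one value vector per KV head, per layer.
    return 2 * cfg.num_layers * cfg.num_kv_heads * cfg.head_dim * bytes_per_value


# Assumed 70B values (not given in this card): 80 layers, 64 query heads,
# 8 key/value heads shared by groups of 8 query heads, head dimension 128.
gqa = AttentionConfig(num_layers=80, num_query_heads=64, num_kv_heads=8, head_dim=128)

# Hypothetical multi-head baseline: every query head keeps its own K/V.
mha = AttentionConfig(num_layers=80, num_query_heads=64, num_kv_heads=64, head_dim=128)

gqa_bytes = kv_cache_bytes_per_token(gqa)
mha_bytes = kv_cache_bytes_per_token(mha)

print(f"GQA: {gqa_bytes / 1024:.0f} KiB per token")
print(f"MHA: {mha_bytes / 1024:.0f} KiB per token")
print(f"Reduction factor: {mha_bytes // gqa_bytes}x")
```

Under these assumptions the cache shrinks by the ratio of query heads to key/value heads, which is what allows longer contexts or larger batches in the same memory budget.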
Intended Use Cases
- Commercial and Research: Designed for a broad range of commercial and research applications in English.
- Natural Language Generation: The pretrained model can be adapted for various natural language generation tasks.
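As an illustration of using the pretrained model for plain text completion, here is a minimal sketch with the Hugging Face Transformers library. It assumes `torch`, `transformers`, and `accelerate` are installed and that enough accelerator memory is available for a 70B checkpoint in half precision; the `generate` helper and its parameters are hypothetical names introduced for this example, not part of the model card.

```python
MODEL_ID = "NousResearch/Llama-2-70b-hf"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Complete `prompt` with the pretrained (non-chat) model.

    Hypothetical helper for illustration; it reloads the model on every call.
    """
    # Imported lazily so this module can be inspected without the heavy deps.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # half precision, about 2 bytes per parameter
        device_map="auto",          # needs `accelerate`; spreads layers over devices
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding for repeatable output
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the full checkpoint on first use):
# print(generate("The three primary colors are"))
```

Because this is the base model rather than a chat-tuned variant, prompts work best when phrased as text to be continued, not as instructions.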
Licensing
Use of this model is governed by a custom commercial license from Meta. Users must accept the license on Meta's website to download the weights and tokenizer.