NousResearch/Llama-2-13b-hf
NousResearch/Llama-2-13b-hf is a 13 billion parameter pretrained generative text model from the Llama 2 family developed by Meta, converted for Hugging Face Transformers. This auto-regressive language model uses an optimized transformer architecture and was trained on 2 trillion tokens with a 4096-token context length. It is intended for commercial and research use in English for various natural language generation tasks.
Model Overview
NousResearch/Llama-2-13b-hf is a 13 billion parameter pretrained model from Meta's Llama 2 family, adapted for Hugging Face. Llama 2 models are a collection of generative text models ranging from 7B to 70B parameters, built on an optimized transformer architecture. This specific model was trained on 2 trillion tokens of publicly available online data with a context length of 4096 tokens.
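Since the model is distributed in the Hugging Face Transformers format, it can be loaded with the standard `AutoTokenizer`/`AutoModelForCausalLM` APIs. The sketch below is one minimal, hypothetical way to run plain text generation with it; the `build_generation_kwargs` helper and its default settings are illustrative assumptions, not part of the model card, and loading the 13B weights in float16 assumes roughly 26 GB of GPU (or offloaded) memory.

```python
"""Hedged sketch: text generation with NousResearch/Llama-2-13b-hf.

Assumes `transformers`, `torch`, and `accelerate` are installed and that
enough memory is available for the 13B weights in float16.
"""

MODEL_ID = "NousResearch/Llama-2-13b-hf"


def build_generation_kwargs(max_new_tokens: int = 128,
                            temperature: float = 0.7) -> dict:
    """Collect common sampling settings for model.generate().

    Hypothetical helper: temperature 0 falls back to greedy decoding,
    so `temperature` is only passed through when sampling is enabled.
    """
    kwargs = {
        "max_new_tokens": max_new_tokens,
        "do_sample": temperature > 0,
    }
    if temperature > 0:
        kwargs["temperature"] = temperature
    return kwargs


def generate(prompt: str) -> str:
    """Load the model and complete the given prompt."""
    # Imports are local so the sketch can be read/imported without
    # the heavy dependencies installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # halves memory vs. float32
        device_map="auto",          # spread across available devices
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **build_generation_kwargs())
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("The Llama 2 family of models"))
```

Note that this is the pretrained base model, not a chat-tuned variant, so it simply continues the prompt rather than following instructions.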
Key Capabilities
- Generative Text: Capable of various natural language generation tasks.
- Optimized Architecture: Built on an optimized transformer architecture for efficient training and inference.
- Extensive Pretraining: Trained on a diverse dataset of 2 trillion tokens, enhancing its general language understanding.
Intended Use Cases
- Commercial and Research: Suitable for both commercial applications and academic research in English.
- Natural Language Generation: Can be adapted for a wide array of tasks requiring text generation.
Performance Highlights
Compared to its Llama 1 13B predecessor, Llama 2 13B shows improvements across various academic benchmarks, including Code (24.5 vs 18.9), Math (28.7 vs 10.9), and MMLU (54.8 vs 46.9). The model was trained between January and July 2023, with a pretraining data cutoff of September 2022.