NousResearch/Llama-2-13b-hf

Status: Warm · Visibility: Public · Parameters: 13B · Quantization: FP8 · Context length: 4096 · Source: Hugging Face

Model Overview

NousResearch/Llama-2-13b-hf is the 13-billion-parameter pretrained model from Meta's Llama 2 family, converted to the Hugging Face Transformers format. Llama 2 is a collection of generative text models ranging from 7B to 70B parameters, built on an optimized transformer architecture. This model was pretrained on 2 trillion tokens of publicly available online data and supports a context length of 4,096 tokens.
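A minimal loading-and-generation sketch using the standard Hugging Face Transformers API (assumes `transformers`, `torch`, and `accelerate` are installed; the prompt, sampling settings, and the `clamp_new_tokens` helper are illustrative, not part of the official card):

```python
MODEL_ID = "NousResearch/Llama-2-13b-hf"
MAX_CONTEXT = 4096  # Llama 2 context window, per the card above


def clamp_new_tokens(prompt_len: int, requested: int, window: int = MAX_CONTEXT) -> int:
    """Clamp generation length so prompt + new tokens fit in the context window."""
    return max(0, min(requested, window - prompt_len))


def main() -> None:
    # Heavy: downloads ~26 GB of weights on first run, so it is defined
    # but not invoked here. device_map="auto" requires the accelerate package.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    prompt = "Simply put, the theory of relativity states that"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    new_tokens = clamp_new_tokens(inputs["input_ids"].shape[1], 128)
    outputs = model.generate(
        **inputs, max_new_tokens=new_tokens, do_sample=True, top_p=0.9
    )
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because this is a base (non-chat) model, prompts are plain text continuations rather than instruction-formatted messages.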

Key Capabilities

  • Generative Text: produces free-form text for a wide range of natural language generation tasks.
  • Optimized Architecture: built on an optimized transformer architecture for efficient inference.
  • Extensive Pretraining: trained on a diverse dataset of 2 trillion tokens, giving broad general language understanding.

Intended Use Cases

  • Commercial and Research: Suitable for both commercial applications and academic research in English.
  • Natural Language Generation: Can be adapted for a wide array of tasks requiring text generation.

Performance Highlights

Compared with its Llama 1 13B predecessor, Llama 2 13B improves across academic benchmarks:

  Benchmark   Llama 1 13B   Llama 2 13B
  Code        18.9          24.5
  Math        10.9          28.7
  MMLU        46.9          54.8

The model was trained between January and July 2023, with a pretraining data cutoff of September 2022.
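The quoted scores can be restated as absolute and relative gains; a small sketch using only the numbers from this card:

```python
# Benchmark scores quoted above: (Llama 1 13B, Llama 2 13B)
SCORES = {
    "Code": (18.9, 24.5),
    "Math": (10.9, 28.7),
    "MMLU": (46.9, 54.8),
}


def improvements(scores: dict) -> dict:
    """Absolute point gain and relative percentage gain per benchmark."""
    out = {}
    for name, (old, new) in scores.items():
        out[name] = {
            "absolute": round(new - old, 1),
            "relative_pct": round(100 * (new - old) / old, 1),
        }
    return out


for name, d in improvements(SCORES).items():
    print(f"{name}: +{d['absolute']} points ({d['relative_pct']}% relative)")
```

The Math gain stands out: at +17.8 points it is more than a 160% relative improvement over Llama 1 13B.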