learnanything/llama-7b-huggingface

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 16, 2023 · License: other · Architecture: Transformer

The learnanything/llama-7b-huggingface model is a 7 billion parameter auto-regressive language model developed by Meta AI's FAIR team, based on the transformer architecture. This version is adapted for Hugging Face's `LlamaModel` and `LlamaTokenizer`, and supports `AutoModel` and `AutoTokenizer` for easier integration. It is primarily intended for research on large language models, focusing on understanding their capabilities and limitations and on developing mitigation techniques for issues such as bias and harmful content generation.


LLaMA-7B Hugging Face Adaptation

This model is the 7 billion parameter version of LLaMA, an auto-regressive language model developed by Meta AI's FAIR team. It is based on the transformer architecture and was trained between December 2022 and February 2023. This specific repository provides an adaptation for seamless integration with Hugging Face's transformers library, supporting AutoModel and AutoTokenizer for straightforward loading, including an option for 8-bit quantization to reduce memory footprint.
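A minimal sketch of that loading path. The repository id below matches this page; the `load_llama` helper name is ours, not part of the library. The 8-bit path additionally requires the `bitsandbytes` package and a CUDA GPU, and assumes a transformers version that still accepts the `load_in_8bit` argument:

```python
def load_llama(model_id: str = "learnanything/llama-7b-huggingface",
               in_8bit: bool = False):
    """Load the LLaMA-7B adaptation via Hugging Face's Auto* classes.

    Sketch only: imports are deferred so the function can be defined
    without transformers installed; the actual weight download happens
    on first call.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    if in_8bit:
        # 8-bit quantization to reduce the memory footprint;
        # needs bitsandbytes and a CUDA device.
        model = AutoModelForCausalLM.from_pretrained(
            model_id, device_map="auto", load_in_8bit=True)
    else:
        model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model
```

`device_map="auto"` spreads layers across available accelerators; drop it to load everything on CPU.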

Key Characteristics

  • Architecture: Transformer-based, auto-regressive language model.
  • Parameters: 7 billion parameters.
  • Training Data: Trained on a diverse dataset including CCNet (67%), C4 (15%), GitHub (4.5%), Wikipedia (4.5%), Books (4.5%), ArXiv (2.5%), and Stack Exchange (2%).
  • Multilingual Support: While primarily English-centric, the training data included 20 languages, suggesting some multilingual capability.
  • Research Focus: Designed as a foundational model for research into large language models, including exploring applications, understanding limitations, and developing bias/harm mitigation strategies.

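The reported data shares can be read as sampling weights over the pre-training corpus; a quick sanity check that the percentages listed above cover the whole mixture:

```python
# Reported pre-training data shares (percent), from the list above.
corpus_shares = {
    "CCNet": 67.0,
    "C4": 15.0,
    "GitHub": 4.5,
    "Wikipedia": 4.5,
    "Books": 4.5,
    "ArXiv": 2.5,
    "Stack Exchange": 2.0,
}

# Normalized sampling weights, as a data loader would use them.
total = sum(corpus_shares.values())
weights = {name: share / total for name, share in corpus_shares.items()}
```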
Intended Use Cases

  • Research: Ideal for researchers studying large language models, their capabilities, and limitations.
  • Application Exploration: Suitable for exploring potential applications such as question answering, natural language understanding, and reading comprehension.
  • Bias and Harm Mitigation: Useful for evaluating and developing techniques to mitigate biases, risks, toxic content generation, and hallucinations.
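A minimal question-answering probe along those lines, assuming a `tokenizer`/`model` pair already loaded via the Auto* classes; the prompt wording and the `answer` helper are illustrative, not part of the library:

```python
def answer(question: str, tokenizer, model, max_new_tokens: int = 64) -> str:
    """Greedy-decode a short answer from the base model.

    LLaMA-7B is not instruction-tuned, so a completion-style (or
    few-shot) prompt usually works better than a bare question.
    """
    prompt = f"Question: {question}\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens,
                                do_sample=False)
    # Strip the prompt tokens, keep only the generated continuation.
    continuation = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(continuation, skip_special_tokens=True).strip()
```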

Performance Highlights

On common sense reasoning tasks, the 7B model achieved scores such as 76.5 on BoolQ, 79.8 on PIQA, and 76.1 on HellaSwag. It is important to note that LLaMA is a base model and has not been fine-tuned with human feedback, meaning it can generate unhelpful, incorrect, or offensive content. Users should conduct further risk evaluation and mitigation before deploying in downstream applications.
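Benchmarks like BoolQ, PIQA, and HellaSwag are typically scored zero-shot by comparing the log-likelihood the model assigns to each candidate continuation. A hedged sketch of that scoring loop; the helper names are ours, and real evaluation harnesses handle tokenization boundaries and length normalization more carefully:

```python
def continuation_logprob(tokenizer, model, context: str,
                         continuation: str) -> float:
    """Sum of token log-probs the model assigns to `continuation`
    when it follows `context`. Note: tokenizing the joined string may
    split differently at the boundary; this sketch ignores that."""
    import torch

    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    full_ids = tokenizer(context + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids.to(model.device)).logits
    # Log-probs at position i predict token i + 1, hence the shift below.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    n_ctx = ctx_ids.shape[1]
    cont_ids = full_ids[0, n_ctx:]
    return logprobs[torch.arange(n_ctx - 1, full_ids.shape[1] - 1),
                    cont_ids].sum().item()

def pick_choice(scores):
    """Index of the highest-scoring candidate continuation."""
    return max(range(len(scores)), key=scores.__getitem__)
```

For a BoolQ item, for example, one would score the continuations "yes" and "no" after the passage-plus-question prompt and take the higher-scoring one as the prediction.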