dfurman/LLaMA-7B
LLaMA-7B: A Foundational Model for LLM Research
LLaMA-7B is a 6.7 billion parameter causal decoder-only language model developed by the FAIR team at Meta AI. It was trained on a substantial 1 trillion token corpus, encompassing diverse datasets such as CCNet, C4, GitHub, Wikipedia, and Books. While the training data includes multiple languages, the model is primarily optimized for English text generation due to the dataset composition.
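The "6.7 billion" figure can be reproduced from the architecture hyperparameters published in the LLaMA paper (hidden size 4096, 32 layers, SwiGLU intermediate size 11008, vocabulary size 32000 — these values come from the paper, not from this card). A quick back-of-the-envelope count:

```python
# Approximate parameter count for LLaMA-7B from its published
# architecture hyperparameters (assumed from the LLaMA paper).
d_model = 4096    # hidden size
n_layers = 32     # transformer blocks
d_ffn = 11008     # SwiGLU intermediate size
vocab = 32000     # tokenizer vocabulary size

embeddings = vocab * d_model       # input token embeddings
attention = 4 * d_model * d_model  # Q, K, V, and output projections
mlp = 3 * d_model * d_ffn          # SwiGLU: gate, up, and down projections
norms = 2 * d_model                # two RMSNorm weight vectors per block
per_layer = attention + mlp + norms

# Embeddings + transformer blocks + final RMSNorm + untied LM head.
total = embeddings + n_layers * per_layer + d_model + vocab * d_model
print(f"{total:,}")  # 6,738,415,616 ≈ 6.7B
```

This is why the "7B" model is reported as 6.7 billion parameters: the exact count is about 6.74B.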
Key Capabilities & Characteristics
- Base Model: LLaMA-7B is a foundational model, designed for further research and fine-tuning rather than direct deployment in downstream applications.
- Training Data: Trained on a 1T token dataset, including a mix of web data, academic sources, and code.
- Multilingual Support: Includes data from 20 languages, though performance is expected to be strongest in English.
- Research Focus: Intended for exploring applications like question answering and natural language understanding, evaluating model biases, and mitigating harmful content generation.
Intended Use Cases
- LLM Research: Ideal for researchers studying large language models, their capabilities, and limitations.
- Bias and Harm Mitigation: Useful for evaluating and developing techniques to address biases, risks, and toxic content generation.
- Foundation for Fine-tuning: Serves as a strong base model for further fine-tuning for specific applications, provided thorough risk evaluation and mitigation are performed.
Note that LLaMA-7B is released under a bespoke non-commercial license and, as a base model, has not been trained with human feedback; without further fine-tuning and safety measures it may generate unhelpful, incorrect, or offensive content.
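With those caveats in mind, a minimal text-generation sketch using the standard Hugging Face transformers API is shown below. The repo id comes from this card; whether this repository's weights load directly via AutoModelForCausalLM, and the license-gated download, are assumptions — adapt as needed.

```python
MODEL_ID = "dfurman/LLaMA-7B"  # this repository; access may require accepting the LLaMA license

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Sketch of base-model sampling; downloads ~13 GB of weights on first call."""
    # Imported lazily so the heavy dependency is only needed when actually generating.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # use the checkpoint's native precision
        device_map="auto",    # place layers on available GPU(s) / CPU
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Base models continue text; they do not follow instructions like a chat model.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=True, top_p=0.9)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because this is a raw base model, prompts should be phrased as text to be continued (e.g. a few-shot pattern) rather than as instructions.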