RiversHaveWings/open_llama_7b_safetensors

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · License: apache-2.0 · Architecture: Transformer · Open Weights

OpenLLaMA is an open-source reproduction of Meta AI's LLaMA large language model, developed by Xinyang Geng and Hao Liu from Berkeley AI Research. This 7 billion parameter model, trained on 1 trillion tokens from the RedPajama dataset, offers a permissively licensed alternative to the original LLaMA. It is designed for general-purpose language understanding and generation, exhibiting comparable performance to LLaMA 7B and GPT-J 6B across various tasks.


OpenLLaMA: An Open Reproduction of LLaMA

OpenLLaMA is a permissively licensed, open-source reproduction of Meta AI's LLaMA large language model, developed by Xinyang Geng and Hao Liu from Berkeley AI Research. This repository provides 7 billion and 3 billion parameter models, with a 13 billion parameter preview also available. The models are trained on the RedPajama dataset, a reproduction of the LLaMA training dataset, using the same preprocessing steps and training hyperparameters as the original LLaMA.

Key Capabilities & Features

  • Architecture: Based on the LLaMA model architecture.
  • Training Data: Trained on 1 trillion tokens from the RedPajama dataset (7B and 3B models).
  • Performance: Achieves comparable performance to the original LLaMA 7B and GPT-J 6B across a wide range of evaluation tasks, including ANLI, ARC Challenge, HellaSwag, and PIQA.
  • Licensing: Released under the Apache 2.0 license, allowing for broad usage.
  • Framework Support: Weights are available for use with Hugging Face Transformers and the EasyLM framework.

Usage Considerations

  • Avoid the Hugging Face fast tokenizer for now, as it can produce incorrect tokenizations for this model. Load the slow tokenizer instead, either via LlamaTokenizer or by passing use_fast=False.
  • The models were trained with a BOS (beginning of sentence) token (id=1), which should be prepended for optimal few-shot evaluation performance.
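The two caveats above can be sketched in code. This is a minimal example assuming the upstream openlm-research/open_llama_7b checkpoint id (adjust the path to the checkpoint you actually use); the ensure_bos helper is a hypothetical convenience function, not part of any library, since the slow LlamaTokenizer already prepends BOS by default.

```python
BOS_ID = 1  # OpenLLaMA's beginning-of-sentence token id


def ensure_bos(token_ids, bos_id=BOS_ID):
    """Prepend the BOS token (id=1) if the encoded prompt lacks it."""
    ids = list(token_ids)
    if not ids or ids[0] != bos_id:
        ids.insert(0, bos_id)
    return ids


if __name__ == "__main__":
    import torch
    from transformers import LlamaForCausalLM, LlamaTokenizer

    model_path = "openlm-research/open_llama_7b"  # assumed checkpoint id

    # Use the slow LlamaTokenizer (equivalent to use_fast=False); the fast
    # tokenizer may tokenize this model's vocabulary incorrectly.
    tokenizer = LlamaTokenizer.from_pretrained(model_path)
    model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)

    prompt = "Q: What is the largest animal?\nA:"
    input_ids = torch.tensor([ensure_bos(tokenizer.encode(prompt))])
    output = model.generate(input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0]))
```

For few-shot evaluation, apply the same BOS check to every formatted prompt before scoring, since the model was trained with BOS at the start of each sequence.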

Good For

  • Researchers and developers seeking a permissively licensed LLaMA-like model for general language tasks.
  • Applications requiring a 7B parameter model with performance comparable to LLaMA 7B.
  • Experimentation and development within the Hugging Face Transformers or EasyLM ecosystems.