RiversHaveWings/open_llama_7b_safetensors
OpenLLaMA is an open-source reproduction of Meta AI's LLaMA large language model, developed by Xinyang Geng and Hao Liu from Berkeley AI Research. This 7 billion parameter model, trained on 1 trillion tokens from the RedPajama dataset, offers a permissively licensed alternative to the original LLaMA. It is designed for general-purpose language understanding and generation, exhibiting comparable performance to LLaMA 7B and GPT-J 6B across various tasks.
OpenLLaMA: An Open Reproduction of LLaMA
OpenLLaMA is a permissively licensed, open-source reproduction of Meta AI's LLaMA large language model, developed by Xinyang Geng and Hao Liu from Berkeley AI Research. This repository provides 7 billion and 3 billion parameter models, with a 13 billion parameter preview also available. The models are trained on the RedPajama dataset, a reproduction of the LLaMA training dataset, using the same preprocessing steps and training hyperparameters as the original LLaMA.
Key Capabilities & Features
- Architecture: Based on the LLaMA model architecture.
- Training Data: Trained on 1 trillion tokens from the RedPajama dataset (7B and 3B models).
- Performance: Achieves comparable performance to the original LLaMA 7B and GPT-J 6B across a wide range of evaluation tasks, including ANLI, ARC Challenge, HellaSwag, and PIQA.
- Licensing: Released under the Apache 2.0 license, allowing for broad usage.
- Framework Support: Weights are available for use with Hugging Face Transformers and the EasyLM framework.
Usage Considerations
- It is advised to avoid using the Hugging Face fast tokenizer for now, as it may lead to incorrect tokenizations. Use the slow `LlamaTokenizer` class, or pass `use_fast=False` to `AutoTokenizer.from_pretrained` (see the loading sketch after this list).
- The models were trained with a BOS (beginning of sentence) token (id=1), which should be prepended for optimal few-shot evaluation performance.
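To make the tokenizer caveat concrete, here is a minimal loading-and-generation sketch with Hugging Face Transformers. The repo id is this repository's; the `float16` dtype and `device_map="auto"` settings are illustrative assumptions (the latter requires the `accelerate` package), so adjust them to your hardware.

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_path = "RiversHaveWings/open_llama_7b_safetensors"

# Slow (SentencePiece-based) tokenizer, avoiding the fast-tokenizer issue
# noted above; AutoTokenizer.from_pretrained(model_path, use_fast=False)
# is equivalent.
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Q: What is the largest animal?\nA:"
# LlamaTokenizer prepends the BOS token (id=1) by default, matching how
# the model was trained, so nothing needs to be added manually here.
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

output = model.generate(input_ids=input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0]))
```

You can verify the BOS handling directly: with the slow tokenizer, `input_ids[0, 0]` will be 1. If you build token sequences yourself for few-shot evaluation, prepend that id explicitly.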
Good For
- Researchers and developers seeking a permissively licensed LLaMA-like model for general language tasks.
- Applications requiring a 7B parameter model with performance comparable to LLaMA 7B.
- Experimentation and development within the Hugging Face Transformers or EasyLM ecosystems.