RiversHaveWings/open_llama_7b_safetensors

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · License: apache-2.0 · Architecture: Transformer · Open Weights

OpenLLaMA is an open-source reproduction of Meta AI's LLaMA large language model, developed by Xinyang Geng and Hao Liu from Berkeley AI Research. This 7 billion parameter model, trained on 1 trillion tokens from the RedPajama dataset, offers a permissively licensed alternative to the original LLaMA. It is designed for general-purpose language understanding and generation, exhibiting comparable performance to LLaMA 7B and GPT-J 6B across various tasks.


OpenLLaMA: An Open Reproduction of LLaMA

OpenLLaMA is a permissively licensed, open-source reproduction of Meta AI's LLaMA large language model, developed by Xinyang Geng and Hao Liu from Berkeley AI Research. This repository provides 7 billion and 3 billion parameter models, with a 13 billion parameter preview also available. The models are trained on the RedPajama dataset, a reproduction of the LLaMA training dataset, using the same preprocessing steps and training hyperparameters as the original LLaMA.

Key Capabilities & Features

  • Architecture: Based on the LLaMA model architecture.
  • Training Data: Trained on 1 trillion tokens from the RedPajama dataset (7B and 3B models).
  • Performance: Achieves comparable performance to the original LLaMA 7B and GPT-J 6B across a wide range of evaluation tasks, including ANLI, ARC Challenge, HellaSwag, and PIQA.
  • Licensing: Released under the Apache 2.0 license, allowing for broad usage.
  • Framework Support: Weights are available for use with Hugging Face Transformers and the EasyLM framework.

Usage Considerations

  • Avoid the Hugging Face fast tokenizer for now, as it can produce incorrect tokenizations for this model. Load the slow tokenizer instead, either via LlamaTokenizer or by passing use_fast=False.
  • The models were trained with a BOS (beginning of sentence) token (id=1), which should be prepended for optimal few-shot evaluation performance.
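The two caveats above can be sketched in code. This is a minimal example assuming the upstream openlm-research/open_llama_7b checkpoint id (adjust the path to the checkpoint you actually use); the ensure_bos helper is a hypothetical convenience function, not part of any library, since the slow LlamaTokenizer already prepends BOS by default.

```python
BOS_ID = 1  # OpenLLaMA's beginning-of-sentence token id


def ensure_bos(token_ids, bos_id=BOS_ID):
    """Prepend the BOS token (id=1) if the encoded prompt lacks it."""
    ids = list(token_ids)
    if not ids or ids[0] != bos_id:
        ids.insert(0, bos_id)
    return ids


if __name__ == "__main__":
    import torch
    from transformers import LlamaForCausalLM, LlamaTokenizer

    model_path = "openlm-research/open_llama_7b"  # assumed checkpoint id

    # Use the slow LlamaTokenizer (equivalent to use_fast=False); the fast
    # tokenizer may tokenize this model's vocabulary incorrectly.
    tokenizer = LlamaTokenizer.from_pretrained(model_path)
    model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)

    prompt = "Q: What is the largest animal?\nA:"
    input_ids = torch.tensor([ensure_bos(tokenizer.encode(prompt))])
    output = model.generate(input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0]))
```

For few-shot evaluation, apply the same BOS check to every formatted prompt before scoring, since the model was trained with BOS at the start of each sequence.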

Good For

  • Researchers and developers seeking a permissively licensed LLaMA-like model for general language tasks.
  • Applications requiring a 7B parameter model with performance comparable to LLaMA 7B.
  • Experimentation and development within the Hugging Face Transformers or EasyLM ecosystems.