sharpbai/open_llama_13b
Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · License: apache-2.0 · Architecture: Transformer · Open Weights

The sharpbai/open_llama_13b model is a 13 billion parameter causal language model developed by OpenLM Research as an open-source reproduction of Meta AI's LLaMA architecture. Trained on 1 trillion tokens from the RedPajama dataset, it offers a permissively licensed alternative to LLaMA. The model is suited to general-purpose language generation tasks and demonstrates performance comparable to the original LLaMA and GPT-J models across various benchmarks.


OpenLLaMA 13B: An Open Reproduction of LLaMA

This model, sharpbai/open_llama_13b, is a 13 billion parameter large language model developed by OpenLM Research. It is an open-source, permissively licensed reproduction of Meta AI's LLaMA architecture, trained on 1 trillion tokens from the RedPajama dataset. The training methodology closely follows the original LLaMA paper, including architecture, context length (4096 tokens), and hyperparameters, with the primary difference being the use of the RedPajama dataset.

Key Capabilities & Features

  • LLaMA Architecture Reproduction: Faithfully replicates the LLaMA model architecture.
  • Permissive Licensing: Released under the Apache 2.0 license, allowing for broad use.
  • Extensive Training Data: Trained on 1 trillion tokens from the RedPajama dataset.
  • Comparable Performance: Achieves performance comparable to the original LLaMA 13B and GPT-J 6B models across a range of evaluation tasks, including ARC Challenge, HellaSwag, and BoolQ.
  • Hugging Face Transformers Integration: Easily loadable and usable with the Hugging Face transformers library. Note that the maintainers advise using LlamaTokenizer directly, or passing use_fast=False to AutoTokenizer, because the fast tokenizer exhibits known tokenization issues with this model.
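Loading the model with the slow tokenizer, as the card advises, can be sketched as follows. This is a minimal example, not an official snippet; the prompt text and generation settings are illustrative, and float16 weights with `device_map="auto"` are one reasonable choice for a 13B model:

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

# Hub repo id from this card.
MODEL_PATH = "sharpbai/open_llama_13b"


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Generate a completion with OpenLLaMA 13B."""
    # Use the slow LlamaTokenizer: the card reports tokenization issues
    # with the fast tokenizer (alternatively, AutoTokenizer with use_fast=False).
    tokenizer = LlamaTokenizer.from_pretrained(MODEL_PATH)
    model = LlamaForCausalLM.from_pretrained(
        MODEL_PATH, torch_dtype=torch.float16, device_map="auto"
    )
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    output = model.generate(input_ids=input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Q: What is the largest animal?\nA:"))
```

The generation call itself requires downloading roughly 26 GB of float16 weights, so the heavy work is kept behind the `__main__` guard.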

When to Use This Model

  • Open-source LLaMA Alternative: Ideal for developers seeking a LLaMA-like model with a permissive license.
  • General Language Generation: Suitable for a range of natural language processing tasks where a 13B parameter model fits the compute budget and quality requirements.
  • Research and Development: Provides a strong baseline for further research, fine-tuning, or experimentation with LLaMA-style models.