m-a-p/OpenLLaMA-Reproduce-1728.05B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

m-a-p/OpenLLaMA-Reproduce-1728.05B is a 7 billion parameter language model, part of the OpenLLaMA family, focused on high-quality, contextually relevant text predictions. It was trained on a diverse composite dataset including web-crawled data, scholarly articles, and literature. This model is designed for broad domain applicability, leveraging a training procedure that closely follows the Llama2 learning rate scheduling strategy.


OpenLLaMA 7Bv2 Model Overview

Building on the OpenLLaMA 7Bv2 architecture, the model is engineered to provide high-quality, contextually relevant text predictions across a wide array of topics.
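As a minimal usage sketch, assuming the checkpoint is published under this repo id on the Hugging Face Hub and loads through transformers' standard causal-LM classes (the prompt and generation settings below are illustrative, not part of the model card):

```python
# Minimal text-generation sketch using Hugging Face transformers.
# Assumes the checkpoint is hosted on the Hub under the repo id below
# and is compatible with the standard LLaMA architecture classes.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/OpenLLaMA-Reproduce-1728.05B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "The James Webb Space Telescope is designed to"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding kept short; tune max_new_tokens / sampling as needed.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```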

Key Capabilities & Training

This model was trained on a comprehensive composite dataset to ensure broad domain coverage. The training data includes:

  • Web-crawled data: the Falcon RefinedWeb and StarCoder datasets.
  • Encyclopedic knowledge: Contributions from Wikipedia.
  • Scientific understanding: Academic papers from arXiv.
  • Literature: A vast collection of books across multiple genres.
  • Question-answer pairs: Stack Exchange data curated by RedPajama.

The training procedure used a maximum learning rate of 3e-4 and a minimum of 3e-5, with a batch size of 4 million tokens. The learning rate schedule closely mirrors Llama2's, optimizing for efficient and stable convergence.
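As a hedged sketch of that schedule: Llama2 uses linear warmup followed by cosine decay to 10% of the peak rate, which matches the 3e-4 / 3e-5 values quoted here. The warmup length and the function name below are illustrative assumptions, not details from the model card:

```python
import math

def openllama_lr(step: int, max_steps: int, warmup_steps: int = 2000,
                 max_lr: float = 3e-4, min_lr: float = 3e-5) -> float:
    """Llama2-style schedule: linear warmup, then cosine decay to min_lr.

    max_lr and min_lr match the values quoted above (min_lr is 10% of
    max_lr, as in Llama2); warmup_steps is an assumed placeholder.
    """
    if step < warmup_steps:
        return max_lr * step / warmup_steps  # linear warmup from 0
    # Cosine anneal over the remaining training steps.
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

At step == max_steps the cosine term reaches -1 and the function returns exactly min_lr, so under these assumptions the learning rate bottoms out at 3e-5 at the end of training.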

Use Cases

Given its diverse training data, this model is well-suited for applications requiring:

  • General text generation and completion.
  • Contextual understanding across various domains.
  • Knowledge retrieval from encyclopedic and scientific sources.
  • Processing and generating content based on literature and Q&A formats.