m-a-p/OpenLLaMA-Reproduce-1933.57B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA-Reproduce-1933.57B is a 7 billion parameter language model developed by m-a-p, designed for high-quality, contextually relevant text generation. It is trained on a diverse composite dataset spanning web-crawled data, scholarly articles, and literature, giving it broad domain coverage. The model is particularly suited to general-purpose text generation and understanding tasks across a wide range of knowledge domains.


OpenLLaMA-Reproduce-1933.57B Overview

OpenLLaMA-Reproduce-1933.57B is a 7 billion parameter language model focused on generating high-quality, contextually relevant text. It was trained by m-a-p on a comprehensive and diverse dataset to ensure broad applicability across various domains.

Key Characteristics

  • Diverse Training Data: The model is trained on a composite dataset that includes the Falcon RefinedWeb dataset, the StarCoder datasets, Wikipedia, arXiv papers, a large collection of books, and Stack Exchange data curated by RedPajama. This mix is intended to provide extensive knowledge and contextual understanding across domains.
  • Optimized Training Procedure: Training used a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, with a batch size of 4 million tokens. The learning-rate schedule closely follows Llama 2's (linear warmup followed by cosine decay to the minimum rate); see the sketch after this list.
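
A minimal sketch of what such a schedule looks like, assuming a Llama 2-style linear warmup followed by cosine decay. Only the peak rate (3e-4) and floor (3e-5) come from this card; the warmup length and total step count below are illustrative placeholders:

```python
import math

def lr_at_step(step, total_steps, warmup_steps=2000,
               max_lr=3e-4, min_lr=3e-5):
    """Llama 2-style schedule: linear warmup, then cosine decay to min_lr.

    warmup_steps and total_steps are illustrative assumptions; only
    max_lr=3e-4 and min_lr=3e-5 are stated for this model.
    """
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))

# With a batch of 4 million tokens, the step count for a given token
# budget would be: total_steps = token_budget // 4_000_000
```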

Use Cases

Thanks to its broad training data, this model is well suited to general-purpose natural language processing tasks that require robust text prediction and contextual understanding.
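
Assuming the checkpoint is published on the Hugging Face Hub under the ID above (not confirmed by this card), a minimal generation sketch with the transformers library might look like this:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub ID taken from this card; availability on the Hub is an assumption.
model_id = "m-a-p/OpenLLaMA-Reproduce-1933.57B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places weights on available GPUs/CPU (requires accelerate).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Recent advances in large language models include"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt plus completion within the model's 4k-token context window.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```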