m-a-p/OpenLLaMA-Reproduce-436.21B

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Apr 1, 2024 | Architecture: Transformer | Cold

m-a-p/OpenLLaMA-Reproduce-436.21B is a 7 billion parameter language model in the OpenLLaMA family, designed to produce high-quality, contextually relevant text. It was trained on a diverse composite dataset spanning web-crawled data, scholarly articles, and question-answer pairs, giving it broad domain coverage. The model is optimized for general-purpose text generation and understanding across a wide range of topics, and its training follows a learning-rate strategy similar to Llama2's for stable convergence.
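
A minimal generation sketch is shown below, assuming the checkpoint is published in a Hugging Face transformers-compatible (LLaMA-style) layout under the repository id in the model name above; the prompt and sampling parameters are placeholders, not recommended settings.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the weights are hosted under this repo id in a
# transformers-compatible (LLaMA-style) format.
model_id = "m-a-p/OpenLLaMA-Reproduce-436.21B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Large language models are useful for"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling parameters are illustrative only.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True,
                         temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```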


OpenLLaMA 7Bv2 Overview

m-a-p/OpenLLaMA-Reproduce-436.21B is a 7 billion parameter language model focused on delivering high-quality, contextually relevant text generation. It is built on a diverse composite dataset to ensure broad applicability across domains.

Key Capabilities

  • Broad Domain Understanding: Trained on a comprehensive dataset including web data (Falcon RefinedWeb), code (StarCoder datasets), encyclopedic knowledge (Wikipedia), scientific papers (arXiv), and a large collection of books and Stack Exchange data.
  • Contextual Text Generation: Designed to produce contextually relevant text outputs, making it suitable for a wide range of language tasks.
  • Optimized Training: Uses a maximum learning rate of 3e-4, a minimum learning rate of 3e-5, and a batch size of roughly 4 million tokens. The learning-rate schedule closely follows the one used for Llama2 for efficient and stable convergence (see the schedule sketch after this list).
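
As an illustration of the schedule described above, the sketch below implements a Llama2-style warmup followed by cosine decay from the stated 3e-4 peak down to the 3e-5 floor. The warmup length and total step count are assumptions chosen for the example, not values reported for this model.

```python
import math

def llama2_style_lr(step, max_lr=3e-4, min_lr=3e-5,
                    warmup_steps=2000, total_steps=250_000):
    """Warmup plus cosine decay, in the style of the Llama2 recipe.

    max_lr and min_lr come from the model card; warmup_steps and
    total_steps are illustrative assumptions only.
    """
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

With a batch of about 4 million tokens per optimizer step, total_steps would follow from dividing the target training-token budget by the per-step batch size.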

Good For

  • General-purpose text generation and completion.
  • Applications requiring broad knowledge across web content, scientific, and literary domains.
  • Tasks benefiting from a model trained with a robust and diverse data mixture.