m-a-p/OpenLLaMA-Reproduce-1073.74B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA-Reproduce-1073.74B is a 7 billion parameter language model developed by m-a-p, trained to deliver high-quality, contextually relevant text predictions. It leverages a diverse composite dataset including web-crawled data, scholarly articles, and literature for broad domain coverage. This model is designed for general-purpose text generation and understanding tasks, with a 4096 token context length.
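The 4096-token context window bounds the prompt and the generated continuation together, so callers must budget both against the same limit. A minimal sketch of that arithmetic (the helper names are illustrative, not part of any published API for this model):

```python
CTX_LEN = 4096  # the model's maximum context window, per the card above

def fits_in_context(prompt_tokens, max_new_tokens, ctx_len=CTX_LEN):
    # The prompt plus all tokens to be generated must fit in the window.
    return len(prompt_tokens) + max_new_tokens <= ctx_len

def truncate_prompt(prompt_tokens, max_new_tokens, ctx_len=CTX_LEN):
    # Keep the most recent tokens so prompt + generation fits the window.
    budget = ctx_len - max_new_tokens
    return prompt_tokens[-budget:]
```

For example, a 5000-token prompt with a 256-token generation budget would be truncated to its last 3840 tokens before inference.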


OpenLLaMA 7Bv2 Model Overview

OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, designed for high-quality, contextually relevant text predictions. It was trained on a diverse composite dataset to ensure broad domain coverage and applicability, making it suitable for a wide range of natural language processing tasks.

Key Capabilities

  • Broad Domain Understanding: Trained on a composite dataset spanning web-crawled data (Falcon RefinedWeb), source code (StarCoder), encyclopedic knowledge (Wikipedia), scientific literature (arXiv), and a large collection of books and Stack Exchange data.
  • Contextual Text Generation: Focuses on generating text that is both high-quality and contextually relevant, leveraging its extensive training data.
  • General-Purpose NLP: Suitable for various applications requiring text prediction, understanding, and generation.

Training Details

The model's training procedure involved:

  • Learning Rate: A maximum learning rate of 3e-4, decaying to a minimum of 3e-5.
  • Batch Size: A batch size of 4 million tokens.
  • Learning Rate Scheduler: Follows the schedule used in Llama 2 for gradual adjustment and stable convergence.
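These numbers are consistent with a Llama 2-style schedule: linear warmup to the 3e-4 peak, then cosine decay to the 3e-5 floor (10% of peak). A minimal sketch, assuming illustrative warmup and total step counts that the card does not publish:

```python
import math

MAX_LR = 3e-4  # peak learning rate, per the training details above
MIN_LR = 3e-5  # minimum learning rate (10% of peak)

def learning_rate(step, total_steps, warmup_steps=2000):
    # warmup_steps and total_steps are illustrative assumptions,
    # not values published for this model.
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return MAX_LR * step / warmup_steps
    # Cosine decay from MAX_LR down to MIN_LR over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine
```

The floor being exactly 10% of the peak matches the convention reported for Llama 2 training, which is presumably why the card cites that schedule.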

Good For

  • General text generation and completion tasks.
  • Applications requiring broad knowledge across various domains.
  • Research and development in natural language processing.