m-a-p/OpenLLaMA-Reproduce-318.77B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA-Reproduce-318.77B is a 7 billion parameter language model from m-a-p, designed to produce high-quality, contextually relevant text predictions. It was trained on a diverse composite dataset including web-crawled data, scholarly articles, and books. The model emphasizes broad domain coverage and applicability, and uses a learning rate schedule similar to Llama2's for stable convergence.


OpenLLaMA 7Bv2 Model Overview

OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, engineered to provide high-quality and contextually relevant text predictions. Its training emphasizes broad domain coverage and applicability, making it suitable for a wide array of natural language processing tasks. The model's training methodology incorporates a learning rate schedule that closely mirrors the strategy employed in Llama2, ensuring efficient and stable convergence.
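
Because this is a standard causal language model, it can in principle be loaded and queried through the Hugging Face transformers API. The sketch below is an assumption about usage rather than documented loading code: the repo id is taken from the listing above, and the generation settings are illustrative.

```python
# Hypothetical usage sketch: load the checkpoint with Hugging Face transformers
# and generate a short completion. Repo id and options are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/OpenLLaMA-Reproduce-318.77B"  # as listed above; adjust if the hub id differs

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",  # requires the accelerate package
)

prompt = "The key idea behind open reproductions of LLaMA is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a completion well within the model's 4k context window.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```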

Key Capabilities

  • Diverse Knowledge Base: Trained on a composite dataset that includes the Falcon RefinedWeb dataset, the StarCoder datasets, and Wikipedia, arXiv academic papers, a large collection of books, and Stack Exchange data curated by RedPajama.
  • Contextual Understanding: Designed to generate contextually relevant text, leveraging its extensive training data for nuanced predictions.
  • Optimized Training: Uses a peak learning rate of 3e-4, a minimum learning rate of 3e-5, and a batch size of 4 million tokens, with a learning rate schedule similar to Llama2's (see the sketch after this list).
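
A minimal sketch of how the stated schedule could be implemented, assuming a Llama2-style linear warmup followed by cosine decay between the published peak and minimum learning rates; the warmup length and total step count below are illustrative assumptions, not published values.

```python
# Sketch of a Llama2-style schedule with the stated hyperparameters:
# linear warmup to a peak of 3e-4, then cosine decay down to 3e-5.
import math

MAX_LR = 3e-4
MIN_LR = 3e-5
WARMUP_STEPS = 2000    # assumed; Llama2 warms up over 2000 steps
TOTAL_STEPS = 100_000  # assumed; depends on the 4M-token batch and corpus size

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak learning rate.
        return MAX_LR * step / WARMUP_STEPS
    # Cosine decay from MAX_LR to MIN_LR over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return MIN_LR + 0.5 * (MAX_LR - MIN_LR) * (1 + math.cos(math.pi * progress))

print(lr_at(0), lr_at(WARMUP_STEPS), lr_at(TOTAL_STEPS))  # 0.0, 3e-4, ~3e-5
```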

Good For

  • General Text Generation: Its broad domain coverage makes it suitable for various text generation tasks.
  • Research and Development: Can serve as a foundation model for further fine-tuning on specific applications (a fine-tuning sketch follows this list).
  • Knowledge-based Applications: Benefits from encyclopedic knowledge from Wikipedia and scientific understanding from arXiv.
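
As one illustration of the fine-tuning use case, the sketch below attaches LoRA adapters to the base model via the peft library. The rank, target module names, and other hyperparameters are assumptions chosen for illustration, not prescribed settings.

```python
# Hypothetical fine-tuning setup: wrap the base model with LoRA adapters so
# only a small number of parameters are trained for a downstream task.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model_id = "m-a-p/OpenLLaMA-Reproduce-318.77B"
base = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Low-rank adapters on the attention projections (LLaMA-style module names assumed).
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable
# ...then train with your preferred loop or Trainer on task-specific data.
```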