m-a-p/OpenLLaMA-Reproduce-1509.95B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA-Reproduce-1509.95B is a 7 billion parameter language model developed by m-a-p, designed for high-quality, contextually relevant text generation. It was trained on a diverse composite dataset spanning web data, scholarly articles, and literature, giving it broad domain coverage. Its training procedure follows Llama2's learning rate scheduling, promoting stable, efficient convergence across a range of text generation tasks.


OpenLLaMA 7Bv2 Model Overview

OpenLLaMA 7Bv2, developed by m-a-p, is a 7 billion parameter language model focused on generating high-quality, contextually relevant text. It is distinguished by its training on a comprehensive and diverse composite dataset, which includes web-crawled data, scholarly articles, and extensive literature, ensuring broad applicability across various domains.

Key Capabilities & Training

  • Diverse Knowledge Base: Trained on a rich mixture comprising the Falcon RefinedWeb corpus, the StarCoder dataset, and Wikipedia, arXiv academic papers, a large collection of books, and Stack Exchange data curated in RedPajama. This broad data exposure enables the model to handle a wide array of topics and question-answer formats.
  • Optimized Training: Training used a maximum learning rate of 3e-4, a minimum learning rate of 3e-5, and a batch size of 4 million tokens. The learning rate schedule closely mirrors Llama2's (linear warmup followed by cosine decay to 10% of the peak rate); see the sketch after this list.
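
For concreteness, here is a minimal sketch of such a Llama2-style schedule using the numbers above: linear warmup to the 3e-4 peak, then cosine decay to the 3e-5 floor (10% of peak). The warmup length is an assumption; Llama2 reports 2,000 warmup steps, and the card does not state what this model used.

```python
import math

def llama2_style_lr(step: int, max_steps: int,
                    max_lr: float = 3e-4,     # peak LR from the card
                    min_lr: float = 3e-5,     # floor LR from the card (10% of peak)
                    warmup_steps: int = 2000  # assumed; Llama2's reported value
                    ) -> float:
    """Linear warmup followed by cosine decay, as in Llama2's recipe."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to max_lr over the warmup window.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))
    return min_lr + (max_lr - min_lr) * cosine
```

With a 4-million-token batch, `max_steps` is simply the total number of training tokens divided by 4M.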

Good For

  • Generating contextually relevant text across diverse topics.
  • Applications requiring broad domain knowledge, from encyclopedic facts to scientific understanding.
  • Tasks benefiting from a model trained with a Llama2-like optimization strategy.
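
For reference, a minimal usage sketch via Hugging Face transformers is below. The repo id simply mirrors the model's listed name and is an assumption, as are the dtype and sampling settings.

```python
# Hedged sketch: assumes the model is hosted under its listed name and
# loads with the standard transformers causal-LM classes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/OpenLLaMA-Reproduce-1509.95B"  # assumed hub path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps a 7B model on one 24 GB GPU
    device_map="auto",
)

prompt = "The arXiv preprint server is primarily used for"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```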