m-a-p/OpenLLaMA-Reproduce-1291.85B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA-Reproduce-1291.85B is a 7 billion parameter language model developed by m-a-p, designed to generate high-quality, contextually relevant text. Trained on a diverse composite dataset spanning web-crawled data, scholarly articles, books, and question-answer pairs, it offers broad domain coverage and is particularly suited to general-purpose text generation and understanding tasks across a wide range of subjects.


OpenLLaMA 7Bv2 Overview

OpenLLaMA 7Bv2 is a 7 billion parameter language model focused on generating high-quality, contextually relevant text. It was trained by m-a-p using a diverse composite dataset to ensure broad domain applicability and understanding across various topics.
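
For quick experimentation, the sketch below shows one way to run the model for text generation with the Hugging Face transformers library. The repo ID is taken from this card's title and is assumed to resolve on the Hub; the prompt and sampling settings are purely illustrative.

```python
# Minimal text-generation sketch with Hugging Face transformers.
# The repo ID below is assumed from this card's title; adjust it to the
# actual checkpoint path if it differs.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/OpenLLaMA-Reproduce-1291.85B"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "The James Webb Space Telescope has revealed"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```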

Key Training Details

The model's training incorporated a rich and varied dataset, drawing on the sources below (a data-mixing sketch follows the list):

  • Web-crawled data: the Falcon RefinedWeb and StarCoder datasets.
  • Encyclopedic knowledge: Contributions from Wikipedia.
  • Scientific understanding: Academic papers sourced from arXiv.
  • Extensive literature: A vast collection of books across multiple genres.
  • Curated Q&A: Stack Exchange data, specifically curated by RedPajama.
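
As a rough illustration of how such a composite mixture can be assembled, the sketch below interleaves two of the listed sources with the Hugging Face datasets library. The repo IDs, the subset choice, and the sampling probabilities are illustrative assumptions, not the actual recipe used for this model.

```python
# Sketch of building a composite pre-training mixture with Hugging Face
# `datasets`. Repo IDs, subset, and probabilities are placeholders; the
# true mixture used for OpenLLaMA 7Bv2 is not specified on this card.
from datasets import load_dataset, interleave_datasets

# Stream two source corpora and normalize both to a single "text" column.
web = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)
web = web.rename_column("content", "text").select_columns(["text"])

code = load_dataset("bigcode/starcoderdata", data_dir="python", split="train", streaming=True)
code = code.rename_column("content", "text").select_columns(["text"])

# Sample from the sources with placeholder weights.
mixture = interleave_datasets([web, code], probabilities=[0.8, 0.2], seed=42)

for example in mixture.take(3):
    print(example["text"][:100])
```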

The training procedure used a maximum learning rate of 3e-4, a minimum learning rate of 3e-5, and a batch size of 4 million tokens. The learning rate scheduler closely followed the strategy used in Llama 2: a warmup phase followed by a gradual cosine decay toward the minimum learning rate for stable convergence.
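
The sketch below reproduces that schedule shape: linear warmup followed by cosine decay from the 3e-4 peak to the 3e-5 floor quoted above. The warmup length and total step count are illustrative assumptions; for example, if the run covered roughly 1T tokens (an assumption here), a 4M-token batch would correspond to about 250,000 optimizer steps.

```python
# Sketch of a Llama-2-style schedule: linear warmup, then cosine decay from
# max_lr (3e-4) down to min_lr (3e-5). Warmup length and total step count
# are assumed for illustration only.
import math

def lr_at_step(step, total_steps, warmup_steps=2000, max_lr=3e-4, min_lr=3e-5):
    if step < warmup_steps:
        # Linear warmup from 0 to max_lr.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# ~1T tokens at a 4M-token batch size -> roughly 250k steps (assumed).
total_steps = 250_000
for s in (0, 2_000, 125_000, 250_000):
    print(s, f"{lr_at_step(s, total_steps):.2e}")
```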

Use Cases

This model is well-suited for applications requiring:

  • General text generation and completion.
  • Contextual understanding and response generation.
  • Tasks benefiting from broad knowledge across web content, academic papers, and literature.