m-a-p/OpenLLaMA-Reproduce-218.1B
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, designed for high-quality, contextually relevant text predictions. It leverages a diverse composite dataset including web-crawled data, scholarly articles, and literature to ensure broad domain coverage. This model is optimized for general-purpose text generation and understanding across various topics, with a context length of 4096 tokens.


OpenLLaMA 7Bv2 Overview

OpenLLaMA 7Bv2 is a 7 billion parameter language model focused on delivering high-quality, contextually relevant text predictions. It was trained on a diverse composite dataset to ensure broad domain coverage and applicability.

Key Training Details

The model's training incorporated a rich and varied composite dataset; a data-mixing sketch follows the list of sources below:

  • Falcon RefinedWeb dataset: For general web knowledge.
  • StarCoder datasets: Likely contributing to code understanding or generation capabilities.
  • Wikipedia: Providing encyclopedic knowledge.
  • arXiv: For scientific and academic understanding.
  • Vast collection of books: Enhancing literary comprehension and generation.
  • Stack Exchange data curated by RedPajama: Offering structured Q&A and technical discussions.
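
The composition and sampling weights of this mixture are not specified in the card, but the sketch below shows one way such a composite stream could be assembled with the Hugging Face datasets library, using two of the listed sources as stand-ins and hypothetical mixing weights; in practice datasets.interleave_datasets can serve the same purpose.

```python
import random
from datasets import load_dataset

# Two of the listed corpora, streamed so nothing is downloaded up front.
# The remaining sources (StarCoder, arXiv, books, Stack Exchange) would be
# added the same way; access terms may apply to some of them.
web = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)
wiki = load_dataset("wikimedia/wikipedia", "20231101.en", split="train", streaming=True)

def mixed_stream(sources, weights, seed=42):
    """Yield raw text drawn from several corpora in proportion to `weights`.

    `sources` is a list of (iterable_dataset, text_column) pairs; the mixing
    weights are hypothetical and not taken from the model card."""
    rng = random.Random(seed)
    iterators = [iter(ds) for ds, _ in sources]
    columns = [col for _, col in sources]
    while True:  # a pretraining stream is effectively unbounded for a demo
        i = rng.choices(range(len(iterators)), weights=weights)[0]
        yield next(iterators[i])[columns[i]]

stream = mixed_stream([(web, "content"), (wiki, "text")], weights=[0.8, 0.2])
for _ in range(3):
    print(next(stream)[:200])
```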

Training Procedure Highlights

The training process used a maximum learning rate of 3e-4 and a minimum of 3e-5, with a large batch size of 4 million tokens. The learning-rate schedule closely followed the strategy used for Llama 2, aiming for optimal convergence and performance.
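
The card does not give the warmup length or the exact decay curve, but a Llama 2-style schedule is typically linear warmup followed by cosine decay from the peak rate down to 10% of it, which matches the 3e-4 / 3e-5 pair quoted above. A minimal sketch, with the warmup length assumed:

```python
import math

def llama2_style_lr(step, max_steps, warmup_steps=2000,
                    peak_lr=3e-4, min_lr=3e-5):
    """Linear warmup to peak_lr, then cosine decay to min_lr.

    warmup_steps=2000 is an assumption; the card states only the peak and
    minimum learning rates."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine

# Example: learning rate at a few points of a 100k-step run.
for s in (0, 2000, 50_000, 100_000):
    print(s, f"{llama2_style_lr(s, max_steps=100_000):.2e}")
```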

Potential Use Cases

Given its broad training data, OpenLLaMA 7Bv2 is suitable for a wide range of applications requiring the following (a minimal loading-and-generation sketch follows the list):

  • General text generation and completion.
  • Question answering based on diverse knowledge domains.
  • Content creation and summarization.
  • Understanding and processing various forms of text, from academic papers to web content.
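
As a concrete starting point, the sketch below loads the model with Hugging Face transformers and generates a completion. The repo id is taken from this page's header, and it is an assumption that the weights are loadable under that name; the FP8 quantization listed above applies to the hosted endpoint, so this local example uses fp16 instead.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/OpenLLaMA-Reproduce-218.1B"  # assumed Hugging Face repo id

# OpenLLaMA tokenizers have historically misbehaved with the fast tokenizer,
# so the slow (SentencePiece) tokenizer is requested here as a precaution.
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Summarize the key idea of the transformer architecture in one sentence:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt plus completion inside the model's 4096-token context window.
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```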