m-a-p/OpenLLaMA-Reproduce-973.08B

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

The m-a-p/OpenLLaMA-Reproduce-973.08B is a 7 billion parameter language model, part of the OpenLLaMA family, designed for high-quality, contextually relevant text predictions. It was trained on a diverse composite dataset including web-crawled data, scholarly articles, and question-answer pairs. This model is optimized for broad domain coverage and general applicability across various text-based tasks.

OpenLLaMA 7B v2 Overview

m-a-p/OpenLLaMA-Reproduce-973.08B is a 7 billion parameter language model built on the OpenLLaMA 7B v2 architecture. It is engineered to provide high-quality, contextually relevant text predictions across a wide array of applications.
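
A minimal inference sketch with the Hugging Face transformers library, assuming the checkpoint follows the standard LLaMA layout under this repo ID; the prompt and generation settings are illustrative, not a prescribed recipe:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "m-a-p/OpenLLaMA-Reproduce-973.08B"  # repo ID from this card

# OpenLLaMA-family checkpoints are usually loaded with the slow tokenizer,
# as the fast LLaMA tokenizer has historically mis-tokenized them.
tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # half precision keeps a 7B model on one GPU
    device_map="auto",
)

prompt = "Q: What is the largest animal?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```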

Training Details

The model was trained on a diverse composite dataset to ensure broad domain coverage (a data-mixing sketch follows the list). This dataset includes:

  • The Falcon RefinedWeb dataset for filtered web-crawl text
  • StarCoder data for source code
  • Wikipedia for encyclopedic knowledge
  • arXiv for scientific understanding
  • A large collection of books
  • Stack Exchange data curated by RedPajama
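
A hedged sketch of how such a mixture can be interleaved for streaming training with the Hugging Face datasets library. The Hub IDs, the Python-only StarCoder subset, and the 85/15 sampling weights are illustrative assumptions, not the card's published recipe, and some of these datasets are gated behind a usage agreement on the Hub:

```python
from datasets import interleave_datasets, load_dataset

# Stream the two largest sources so nothing has to fit on disk.
# Hub IDs and the Python-only StarCoder subset are illustrative choices.
web = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)
code = load_dataset(
    "bigcode/starcoderdata", data_dir="python", split="train", streaming=True
)

# Keep only the shared text column so the two schemas line up.
web = web.select_columns(["content"])
code = code.select_columns(["content"])

# Assumed 85/15 web-to-code ratio; the card does not publish proportions.
mixture = interleave_datasets([web, code], probabilities=[0.85, 0.15], seed=42)

for example in mixture.take(3):
    print(example["content"][:80])
```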

The training procedure used a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, with a batch size of 4 million tokens. The learning-rate schedule closely follows the strategy employed in Llama2.
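
For intuition, a minimal sketch of a Llama2-style schedule consistent with those numbers: linear warmup, then cosine decay from the 3e-4 peak to the 3e-5 floor. The warmup and total step counts are assumed placeholders, since the card does not state them:

```python
import math

MAX_LR = 3e-4          # peak learning rate from the card
MIN_LR = 3e-5          # floor, 10% of peak, as in Llama2
WARMUP_STEPS = 2_000   # assumed placeholder; not stated on the card
TOTAL_STEPS = 250_000  # assumed placeholder; not stated on the card

def lr_at(step: int) -> float:
    """Llama2-style schedule: linear warmup, then cosine decay to MIN_LR."""
    if step < WARMUP_STEPS:
        return MAX_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(progress, 1.0)))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine

for s in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(f"step {s:>7}: lr = {lr_at(s):.2e}")
```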

Key Capabilities

  • Contextually relevant text generation: Designed to produce coherent and relevant text based on input context.
  • Broad domain understanding: Leverages a diverse training corpus for applicability across various topics.
  • General-purpose language tasks: Suitable for a wide range of text prediction and understanding tasks due to its comprehensive training data.

Good For

  • Applications requiring general text generation.
  • Tasks benefiting from broad knowledge across web, scientific, and literary domains.
  • Developers seeking a 7B parameter model with a robust training foundation.