m-a-p/OpenLLaMA-Reproduce-754.97B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA-Reproduce-754.97B is a 7 billion parameter language model developed by m-a-p, designed to generate high-quality, contextually relevant text. It was trained on a diverse composite dataset spanning web-crawled data, scholarly articles, and books, and it follows a training strategy similar to Llama2 to achieve broad domain coverage and robust performance.
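
As a quick-start sketch, the checkpoint can presumably be loaded with Hugging Face transformers. The repository id below is inferred from the model name on this card and may differ from the actual published path; half-precision loading is an illustrative choice, since the FP8 quantization listed above is a serving-platform detail rather than a transformers argument.

```python
# Sketch: loading the checkpoint with Hugging Face transformers.
# The repo id is inferred from the model name above and may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/OpenLLaMA-Reproduce"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision for single-GPU inference
    device_map="auto",
)
```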


OpenLLaMA 7Bv2 Overview

OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, focused on generating high-quality and contextually relevant text. Its training regimen emphasizes broad domain coverage, utilizing a diverse composite dataset to ensure versatility across various applications.

Key Training Details

  • Dataset Composition: The model was trained on a rich dataset including:
    • The Falcon RefinedWeb dataset
    • The StarCoder dataset
    • Wikipedia for encyclopedic knowledge
    • arXiv for scientific understanding
    • A wide collection of books
    • Stack Exchange data curated by RedPajama
  • Learning Rate Strategy: Employed a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, with a scheduler closely following the Llama2 strategy (linear warmup followed by cosine decay to 10% of the peak rate) for stable convergence; a sketch of this schedule follows the list.
  • Batch Size: Utilized a batch size of 4 million tokens per optimizer step, which at the 4k context length corresponds to roughly 1,000 sequences per batch, balancing training efficiency and performance.
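
The learning-rate schedule described above can be sketched as linear warmup followed by cosine decay. The peak (3e-4) and floor (3e-5, i.e. 10% of peak, matching the Llama2 recipe) come from this card; the warmup length and total step count below are illustrative assumptions, not values the card states, though a 2000-step warmup mirrors Llama2's published setup.

```python
import math

def cosine_lr(step: int, max_steps: int, warmup_steps: int = 2000,
              max_lr: float = 3e-4, min_lr: float = 3e-5) -> float:
    """Llama2-style schedule: linear warmup, then cosine decay to min_lr.

    max_lr and min_lr come from the card; warmup_steps and max_steps
    are illustrative assumptions for this sketch.
    """
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps  # linear warmup
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

Decaying to exactly 10% of the peak rate is the Llama2 convention, which is consistent with the card's statement that the scheduler closely follows the Llama2 strategy.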

Intended Use Cases

This model is suitable for tasks requiring:

  • General text generation and completion (see the example after this list)
  • Applications benefiting from broad domain knowledge
  • Contextually aware language understanding
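
For plain text completion, a hedged example using the `model` and `tokenizer` from the loading sketch above is shown below; the prompt and sampling parameters are placeholder defaults for illustration, not values recommended by the card.

```python
# Illustrative completion call, reusing `model` and `tokenizer` from the
# loading sketch above. Sampling parameters are placeholder defaults.
prompt = "The key difference between fission and fusion is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,    # sampled decoding for open-ended completion
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```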