m-a-p/OpenLLaMA-Reproduce-2041.21B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, designed for high-quality, contextually relevant text generation. It was trained on a diverse composite dataset spanning web-crawled data, scholarly articles, and literature, giving it broad domain coverage. The model targets general-purpose text generation and understanding across a wide range of topics, using a training procedure similar to that of Llama2.


OpenLLaMA 7Bv2 Overview

OpenLLaMA 7Bv2 is a 7 billion parameter language model focused on generating high-quality, contextually relevant text. It was trained on a comprehensive and diverse composite dataset to ensure broad domain coverage and applicability.

Key Capabilities & Training Details

  • Diverse Training Data: The model was trained on a rich dataset comprising:
    • Falcon refined-web dataset
    • starcoder datasets
    • Wikipedia for encyclopedic knowledge
    • arXiv for scientific understanding
    • A vast collection of books across multiple genres
    • Stack Exchange data curated by RedPajama
  • Optimized Training Procedure: Training used a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, with a batch size of 4 million tokens. The learning-rate schedule closely follows the one used for Llama2.
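As a rough illustration of a Llama2-style schedule (linear warmup followed by cosine decay from the peak to the minimum learning rate), here is a minimal sketch using the rates stated above; the warmup length and total step count are hypothetical placeholders, not values reported for this model:

```python
import math

def lr_at(step, total_steps, max_lr=3e-4, min_lr=3e-5, warmup_steps=2000):
    """Learning rate at a given step: linear warmup, then cosine decay.

    max_lr/min_lr match the 3e-4 / 3e-5 values from the model card;
    warmup_steps and total_steps are illustrative assumptions.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

Note that the stated minimum (3e-5) is exactly 10% of the peak (3e-4), which matches the "decay to 10% of peak" convention used in the Llama2 training recipe.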

Use Cases

Thanks to its extensive and varied training data, this model is well-suited for general text generation, question answering, and applications requiring broad domain understanding. Its design targets robust performance across diverse linguistic contexts.