m-a-p/OpenLLaMA-Reproduce-1610.61B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA 7Bv2 by m-a-p is a 7 billion parameter language model designed for high-quality, contextually relevant text generation. It is trained on a diverse composite dataset including web-crawled data, scholarly articles, and question-answer pairs, giving it broad domain coverage. The model is optimized for general-purpose language understanding and generation across a wide range of topics.


OpenLLaMA 7Bv2 Model Overview

OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, focusing on generating high-quality and contextually relevant text. It is built to provide broad domain applicability through its extensive training on a diverse dataset.

Key Characteristics

  • Diverse Training Data: The model was trained on a composite dataset that includes:
    • Falcon RefinedWeb dataset
    • StarCoder datasets for code
    • Wikipedia for encyclopedic knowledge
    • arXiv for scientific understanding
    • A vast collection of books
    • Stack Exchange data curated by RedPajama
  • Optimized Training Procedure: The training utilized a maximum learning rate of 3e-4, a minimum learning rate of 3e-5, and a batch size of 4 million tokens. The learning rate scheduler closely follows the strategy used in Llama2 for optimal convergence.
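The schedule described above (max LR 3e-4 decaying to min LR 3e-5, following Llama2) is typically a linear warmup followed by cosine decay. A minimal sketch, assuming a hypothetical warmup length and total step count (neither is stated in this card):

```python
import math

def llama2_style_lr(step, total_steps, warmup_steps=2000,
                    max_lr=3e-4, min_lr=3e-5):
    """Linear warmup, then cosine decay from max_lr to min_lr.

    warmup_steps and total_steps are illustrative assumptions;
    only max_lr and min_lr come from the model card.
    """
    if step < warmup_steps:
        # linear warmup from ~0 up to max_lr
        return max_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # goes 1 -> 0
    return min_lr + (max_lr - min_lr) * cosine
```

At the end of warmup the rate is exactly 3e-4, and it settles at 3e-5 by the final step, matching the max/min pair given above.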

Potential Use Cases

  • General Text Generation: Suitable for generating human-like text across various topics and styles.
  • Question Answering: Can be applied to answer questions based on its broad knowledge base derived from diverse training data.
  • Content Creation: Useful for drafting articles, summaries, or creative content due to its contextual understanding.
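For the generation use cases above, decoder-only models like this one produce text one token at a time by sampling from the output distribution. A minimal sketch of temperature plus top-k sampling over a raw logits vector (the model, tokenizer, and default values here are illustrative assumptions, not part of this card):

```python
import math
import random

def sample_next_token(logits, temperature=0.8, top_k=40, rng=random):
    """Pick a token index from raw logits via temperature + top-k sampling.

    In real use, `logits` would be the model's output for the next
    position; here it is just a plain list of floats.
    """
    # scale by temperature and keep only the top_k candidates
    scaled = sorted(((l / temperature, i) for i, l in enumerate(logits)),
                    reverse=True)[:top_k]
    # numerically stable softmax over the surviving logits
    m = scaled[0][0]
    weights = [math.exp(l - m) for l, _ in scaled]
    # draw proportionally to the softmax weights
    r = rng.random() * sum(weights)
    for w, (_, idx) in zip(weights, scaled):
        r -= w
        if r <= 0:
            return idx
    return scaled[-1][1]
```

With `top_k=1` this reduces to greedy decoding; raising the temperature flattens the distribution and yields more varied output.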