m-a-p/OpenLLaMA-Reproduce-1828.72B
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer
m-a-p/OpenLLaMA-Reproduce-1828.72B is a 7-billion-parameter OpenLLaMA model developed by m-a-p for high-quality, contextually relevant text generation. It is trained on a diverse composite dataset that includes web-crawled data, scholarly articles, and question-answer pairs, giving it broad domain coverage. The model is well suited to applications that need general-purpose language understanding and generation across a wide range of topics.
OpenLLaMA 7B v2 Overview
m-a-p/OpenLLaMA-Reproduce-1828.72B is a 7 billion parameter language model, part of the OpenLLaMA family, focused on generating high-quality, contextually relevant text. Its training leverages a diverse composite dataset, ensuring broad applicability across various domains.
Key Capabilities & Training
- Broad Domain Coverage: Trained on a rich dataset combining web-crawled text (Falcon RefinedWeb), code (the StarCoder datasets), encyclopedic knowledge (Wikipedia), scientific writing (arXiv), and extensive literature and Q&A pairs (books and Stack Exchange data curated by RedPajama).
- Optimized Training Procedure: Uses a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, with a large batch size of 4 million tokens. The learning rate schedule follows the strategy used in Llama 2 for efficient convergence (a rough schedule sketch follows this list).
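The numbers above imply a decay from 3e-4 down to 3e-5 over the course of training. As a rough illustration only (the warmup and total step counts below are assumptions, not values from this card), a Llama-style cosine schedule with linear warmup can be written as:

```python
import math

def lr_at_step(step: int,
               warmup_steps: int = 2_000,       # assumed, not from this card
               total_steps: int = 250_000,      # assumed, not from this card
               max_lr: float = 3e-4,            # maximum LR stated above
               min_lr: float = 3e-5) -> float:  # minimum LR stated above
    """Cosine decay with linear warmup, Llama-style (illustrative sketch)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to max_lr.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = min((step - warmup_steps) / (total_steps - warmup_steps), 1.0)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```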
Good For
- General-purpose text generation: Its diverse training data makes it suitable for a wide array of text prediction tasks.
- Contextual understanding: Designed to provide contextually relevant outputs across various topics.
- Applications requiring broad knowledge: Benefits from its training on encyclopedic, scientific, and literary sources.
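For reference, below is a minimal inference sketch using Hugging Face transformers. It assumes the checkpoint is published on the Hub under this page's model id and loads with the standard LLaMA architecture; the dtype, sampling parameters, and prompt are illustrative choices, not recommendations from the model authors.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model id taken from this page; availability on the Hub is assumed.
model_id = "m-a-p/OpenLLaMA-Reproduce-1828.72B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 7B model in fp16 fits on a single 24 GB GPU
    device_map="auto",
)

prompt = "The main differences between arXiv preprints and journal articles are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt + generated tokens within the 4k context length listed above.
output = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```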