m-a-p/OpenLLaMA-Reproduce-973.08B
m-a-p/OpenLLaMA-Reproduce-973.08B is a 7-billion-parameter language model in the OpenLLaMA family, designed to produce high-quality, contextually relevant text predictions. It was trained on a diverse composite dataset spanning web-crawled text, source code, scholarly articles, books, and question-answer data, giving it broad domain coverage and general applicability across text-based tasks.
OpenLLaMA 7B v2 Overview
m-a-p/OpenLLaMA-Reproduce-973.08B is a 7-billion-parameter language model built on the OpenLLaMA architecture. It is engineered to provide high-quality, contextually relevant text predictions across a wide range of applications.
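A quick way to try the model is through the standard Hugging Face transformers text-generation API. The snippet below is a minimal sketch: the repository id is taken from this card's title, and the dtype, device, and sampling settings are illustrative assumptions rather than recommended values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id taken from the card title; adjust if the hosted name differs.
model_id = "m-a-p/OpenLLaMA-Reproduce-973.08B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # assumed half-precision loading; use defaults on CPU
    device_map="auto",           # requires the accelerate package
)

prompt = "Q: What is a large language model?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```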
Training Details
The model was trained on a diverse composite dataset to ensure broad domain coverage. This dataset includes:
- The Falcon RefinedWeb dataset for web-crawled text
- The StarCoder datasets for source code
- Wikipedia for encyclopedic knowledge
- arXiv for scientific understanding
- A large collection of books
- Stack Exchange data curated by RedPajama
The training procedure used a peak learning rate of 3e-4 decaying to a minimum of 3e-5, with a batch size of 4 million tokens. The learning-rate schedule closely follows the strategy used for Llama 2, as sketched below.
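For reference, the following sketch shows a cosine-decay schedule with linear warmup that matches the reported peak (3e-4) and minimum (3e-5) rates; the warmup and total step counts are assumptions, since the card does not state them.

```python
import math

def learning_rate(step, total_steps, warmup_steps=2000, max_lr=3e-4, min_lr=3e-5):
    """Cosine decay with linear warmup; max_lr and min_lr follow this card,
    warmup_steps and total_steps are assumed for illustration."""
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine

# Example: learning rate at the midpoint of a hypothetical 250k-step run.
print(learning_rate(step=125_000, total_steps=250_000))
```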
Key Capabilities
- Contextually relevant text generation: Designed to produce coherent and relevant text based on input context.
- Broad domain understanding: Leverages a diverse training corpus for applicability across various topics.
- General-purpose language tasks: Suitable for a wide range of text prediction and understanding tasks thanks to its comprehensive training data (see the perplexity sketch after this list).
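As a concrete illustration of the text-prediction use case, the sketch below scores a short passage by perplexity using the standard transformers API; the repository id comes from this card's title, and the sample sentence is an arbitrary example, not part of any official evaluation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id taken from the card title; this is an illustrative sketch,
# not an official evaluation script.
model_id = "m-a-p/OpenLLaMA-Reproduce-973.08B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

text = "Open-source language models can be fine-tuned for downstream tasks."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing the input ids as labels makes the model return the mean
    # cross-entropy over the predicted tokens.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"Perplexity: {torch.exp(loss).item():.2f}")
```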
Good For
- Applications requiring general text generation.
- Tasks benefiting from broad knowledge across web, scientific, and literary domains.
- Developers seeking a 7B parameter model with a robust training foundation.