m-a-p/OpenLLaMA-Reproduce-1933.57B
OpenLLaMA-Reproduce-1933.57B is a 7 billion parameter language model developed by m-a-p, designed to generate high-quality, contextually relevant text. It is trained on a diverse composite dataset spanning web-crawled data, scholarly articles, and literature, ensuring broad domain coverage. The model is particularly suited to general-purpose text generation and understanding tasks across a wide range of knowledge domains.
OpenLLaMA-Reproduce-1933.57B Overview
OpenLLaMA-Reproduce-1933.57B is a 7 billion parameter language model focused on generating high-quality, contextually relevant text. It was trained by m-a-p on a comprehensive and diverse dataset to ensure broad applicability across various domains.
Key Characteristics
- Diverse Training Data: The model leverages a composite dataset that includes the Falcon RefinedWeb dataset, the StarCoder datasets, and Wikipedia, arXiv academic papers, a large collection of books, and Stack Exchange data drawn from the RedPajama corpus. This diverse training mix aims to provide extensive knowledge and contextual understanding.
- Optimized Training Procedure: Training used a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, with a batch size of 4 million tokens. The learning rate schedule closely follows the strategy used in Llama 2.
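The schedule described above can be sketched as a cosine decay from the 3e-4 peak to the 3e-5 floor, which matches the Llama 2-style recipe (cosine decay to 10% of the peak learning rate). This is an illustrative sketch only: the warmup length and total step count below are assumed placeholder values, not figures reported for this model.

```python
import math

def lr_at_step(step: int, total_steps: int, warmup_steps: int = 2000,
               max_lr: float = 3e-4, min_lr: float = 3e-5) -> float:
    """Cosine learning-rate schedule with linear warmup.

    Decays from max_lr to min_lr, as in a Llama 2-style recipe.
    warmup_steps and total_steps here are illustrative assumptions.
    """
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

For example, the rate peaks at 3e-4 once warmup ends and reaches the 3e-5 floor at the final step, consistent with the maximum and minimum rates stated above.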
Use Cases
This model is well-suited for general-purpose natural language processing tasks requiring robust text prediction and contextual understanding, benefiting from its broad training data.