m-a-p/OpenLLaMA-Reproduce-2041.21B
OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, designed to produce high-quality, contextually relevant text. It was trained on a diverse composite dataset that includes web-crawled data, scholarly articles, and literature, giving it broad domain coverage. The model targets general-purpose text generation and understanding across a wide range of topics, and its training procedure closely follows that of Llama2.
OpenLLaMA 7Bv2 Overview
OpenLLaMA 7Bv2 is a 7 billion parameter language model focused on generating high-quality, contextually relevant text. It was trained on a comprehensive and diverse composite dataset to ensure broad domain coverage and applicability.
Key Capabilities & Training Details
- Diverse Training Data: The model was trained on a rich dataset comprising:
  - The Falcon RefinedWeb dataset
  - The StarCoder datasets
  - Wikipedia for encyclopedic knowledge
  - arXiv for scientific understanding
  - A vast collection of books across multiple genres
  - Stack Exchange data curated by RedPajama
- Optimized Training Procedure: Training used a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, with a batch size of 4 million tokens. The learning rate schedule closely mirrors the one used for Llama2, supporting stable convergence; a sketch of such a schedule is shown after this list.
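The card states only the maximum and minimum learning rates and that the schedule follows Llama2, which used a cosine decay with linear warmup. The sketch below illustrates one such schedule under those assumptions; the warmup length and total step count are illustrative placeholders, not values from the card.

```python
import math

MAX_LR = 3e-4         # maximum learning rate from the model card
MIN_LR = 3e-5         # minimum learning rate from the model card
WARMUP_STEPS = 2_000  # assumed warmup length; not stated in the card
TOTAL_STEPS = 250_000 # assumed total optimizer steps; not stated in the card

def learning_rate(step: int) -> float:
    """Llama2-style schedule: linear warmup, then cosine decay MAX_LR -> MIN_LR."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to the peak learning rate.
        return MAX_LR * (step + 1) / WARMUP_STEPS
    if step >= TOTAL_STEPS:
        return MIN_LR
    # Cosine decay over the remaining steps, bottoming out at MIN_LR.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine

if __name__ == "__main__":
    for s in (0, 1_000, 2_000, 50_000, 125_000, 250_000):
        print(f"step {s:>7}: lr = {learning_rate(s):.2e}")
```

Note that the stated minimum of 3e-5 is 10% of the peak 3e-4, which is consistent with the decay-to-10%-of-peak convention used for Llama2.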
Use Cases
Thanks to its extensive and varied training data, the model is well suited to general text generation, question answering, and other applications that require broad domain understanding across diverse linguistic contexts. A minimal inference sketch is shown below.
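The following sketch shows one way to load the model for text generation with Hugging Face transformers, assuming the checkpoint is published in the standard Llama/transformers format. The repo id is taken from the page title and the generation settings are illustrative choices, not values prescribed by this card.

```python
# Minimal text-generation sketch with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/OpenLLaMA-Reproduce-2041.21B"  # assumed hub repo id (from the page title)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on a single GPU
    device_map="auto",
)

prompt = "Q: What is the capital of France?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since this is a base model rather than an instruction-tuned one, prompts generally work best as completions (as in the question-answer prefix above) rather than as chat-style instructions.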