m-a-p/OpenLLaMA-Reproduce-754.97B
OpenLLaMA-Reproduce-754.97B is a 7 billion parameter language model developed by m-a-p, designed to produce high-quality, contextually relevant text. It was trained on a diverse composite dataset spanning web-crawled data, code, scholarly articles, and books, and is optimized for broad domain coverage and general applicability, following a training strategy similar to Llama 2 for robust performance.
OpenLLaMA 7Bv2 Overview
OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, focused on generating high-quality and contextually relevant text. Its training regimen emphasizes broad domain coverage, utilizing a diverse composite dataset to ensure versatility across various applications.
Key Training Details
- Dataset Composition: The model was trained on a rich composite dataset including:
  - The Falcon RefinedWeb dataset
  - StarCoder datasets
  - Wikipedia for encyclopedic knowledge
  - arXiv for scientific understanding
  - A wide collection of books
  - Stack Exchange data curated by RedPajama
- Learning Rate Strategy: Employed a maximum learning rate of 3e-4 and a minimum of 3e-5, with a scheduler closely following the Llama 2 recipe for stable convergence (a sketch of such a schedule follows this list).
- Batch Size: Utilized a batch size of 4 million tokens to balance training efficiency and performance.
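The card states only the peak and floor learning rates; the Llama 2 recipe it references pairs these with a linear warmup followed by cosine decay down to the floor. The sketch below illustrates one plausible schedule under that assumption; `warmup_steps` and `total_steps` are illustrative placeholders, not values reported for this model.

```python
import math

def llama2_style_lr(step: int,
                    max_lr: float = 3e-4,       # peak LR from the card
                    min_lr: float = 3e-5,       # floor LR from the card
                    warmup_steps: int = 2_000,  # assumed, not reported on the card
                    total_steps: int = 100_000  # assumed, not reported on the card
                    ) -> float:
    """Linear warmup to max_lr, then cosine decay to min_lr (Llama 2-style)."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to the peak learning rate
        return max_lr * step / max(1, warmup_steps)
    # Cosine decay over the remaining steps, bottoming out at min_lr
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(progress, 1.0)))
    return min_lr + (max_lr - min_lr) * cosine
```

With these placeholder step counts the rate reaches 3e-4 at the end of warmup and decays smoothly to the 3e-5 minimum quoted above by the final step.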
Intended Use Cases
This model is suitable for tasks requiring:
- General text generation and completion
- Applications benefiting from broad domain knowledge
- Contextually aware language understanding
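For example, general text generation can be run with the Hugging Face transformers library, assuming the checkpoint follows the standard LLaMA architecture and is hosted under the repository name shown in the title. The snippet below is a minimal sketch; the prompt and generation settings are illustrative, not tuned recommendations.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository ID taken from the card title; adjust if the hosted name differs.
model_id = "m-a-p/OpenLLaMA-Reproduce-754.97B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the accelerate package; drop it for CPU-only loading.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Open-source language models are useful because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a short continuation; settings are illustrative, not tuned.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```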