m-a-p/OpenLLaMA-Reproduce-1073.74B
OpenLLaMA-Reproduce-1073.74B is a 7 billion parameter language model developed by m-a-p, trained to deliver high-quality, contextually relevant text predictions. It leverages a diverse composite dataset including web-crawled data, scholarly articles, and literature for broad domain coverage. This model is designed for general-purpose text generation and understanding tasks, with a 4096 token context length.
OpenLLaMA 7Bv2 Model Overview
OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p. It was trained on a diverse composite dataset to ensure broad domain coverage, making it suitable for a wide range of natural language processing tasks that call for high-quality, contextually relevant text.
Key Capabilities
- Broad Domain Understanding: Trained on a composite dataset spanning web-crawled text (Falcon RefinedWeb), source code (StarCoder data), encyclopedic knowledge (Wikipedia), scientific literature (arXiv), and a large collection of books and Stack Exchange data.
- Contextual Text Generation: Focuses on generating text that is both high-quality and contextually relevant, leveraging its extensive training data.
- General-Purpose NLP: Suitable for various applications requiring text prediction, understanding, and generation; a minimal loading and generation sketch follows this list.
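As a concrete starting point, the model can be loaded through the Hugging Face transformers library. The following is a minimal sketch, assuming the repository id from this card's title and standard AutoModelForCausalLM support; the prompt and generation settings are illustrative, not recommended defaults.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id taken from this card's title; adjust if the hub path differs.
MODEL_ID = "m-a-p/OpenLLaMA-Reproduce-1073.74B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on one GPU
    device_map="auto",
)

prompt = "The three most important ideas in natural language processing are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a completion; max_new_tokens stays well inside the 4096-token context.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```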
Training Details
The model's training procedure involved:
- Learning Rate: A peak learning rate of 3e-4, decaying to a minimum of 3e-5 (10% of the peak).
- Batch Size: 4 million tokens per batch, which works out to roughly 1,000 sequences at the 4096-token context length.
- Learning Rate Scheduler: Follows the Llama 2 strategy (warmup followed by cosine decay to the minimum rate) for gradual adjustment and stable convergence; a schedule sketch follows this list.
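For reference, Llama 2 pairs a linear warmup with cosine decay to 10% of the peak rate, which matches the 3e-4 / 3e-5 pair above. The sketch below illustrates that schedule shape; the warmup length and total step count are assumptions chosen for illustration, not values from this card.

```python
import math

# Peak and minimum rates come from the card; warmup and total steps are
# hypothetical values used only to make the schedule concrete.
MAX_LR = 3e-4
MIN_LR = 3e-5
WARMUP_STEPS = 2_000
TOTAL_STEPS = 250_000

def lr_at(step: int) -> float:
    """Llama 2-style schedule: linear warmup, then cosine decay to MIN_LR."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to the peak rate.
        return MAX_LR * step / WARMUP_STEPS
    # Cosine decay from MAX_LR down to MIN_LR over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine

# Spot-check the endpoints of the schedule.
print(lr_at(WARMUP_STEPS))  # 3e-4 (peak, reached at the end of warmup)
print(lr_at(TOTAL_STEPS))   # 3e-5 (minimum, reached at the final step)
```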
Good For
- General text generation and completion tasks.
- Applications requiring broad knowledge across various domains.
- Research and development in natural language processing.