m-a-p/OpenLLaMA-Reproduce-1610.61B
Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4K | Published: Apr 1, 2024 | Architecture: Transformer
OpenLLaMA 7Bv2 by m-a-p is a 7-billion-parameter language model designed for high-quality, contextually relevant text generation. It is trained on a diverse composite dataset including web-crawled data, scholarly articles, and question-answer pairs, giving it broad domain coverage. The model is optimized for general-purpose language understanding and generation across a wide range of topics.
OpenLLaMA 7Bv2 Model Overview
OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, focusing on generating high-quality and contextually relevant text. It is built to provide broad domain applicability through its extensive training on a diverse dataset.
Key Characteristics
- Diverse Training Data: The model was trained on a composite dataset that includes:
  - the Falcon RefinedWeb dataset
  - the StarCoder datasets
  - Wikipedia, for encyclopedic knowledge
  - arXiv, for scientific understanding
  - a vast collection of books
  - Stack Exchange data curated by RedPajama
- Optimized Training Procedure: Training used a maximum learning rate of 3e-4, a minimum learning rate of 3e-5, and a batch size of 4 million tokens. The learning-rate scheduler closely follows the strategy used in Llama 2 for stable convergence (a sketch of such a schedule appears below).
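As an illustration, here is a minimal sketch of a warmup-plus-cosine learning-rate schedule in the Llama-family style, decaying from the stated maximum (3e-4) to the stated minimum (3e-5). Only those two values come from the card; the warmup length and total step count are assumptions chosen for illustration.

```python
import math

def lr_schedule(step, max_lr=3e-4, min_lr=3e-5,
                warmup_steps=2000, total_steps=250_000):
    """Warmup-then-cosine schedule. warmup_steps and total_steps are
    illustrative assumptions, not values stated in the model card."""
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr down to the min_lr floor.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine
```

Decaying to a nonzero floor (here 10% of the peak, i.e. 3e-5) rather than to zero matches the Llama-family convention the card references.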
Potential Use Cases
- General Text Generation: Suitable for generating human-like text across a wide range of topics and styles (see the usage sketch after this list).
- Question Answering: Can be applied to answer questions based on its broad knowledge base derived from diverse training data.
- Content Creation: Useful for drafting articles, summaries, or creative content due to its contextual understanding.
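The sketch below shows one common way to load a model like this for text generation with Hugging Face transformers. The repository id is taken from the page title, but whether this checkpoint is actually hosted under that id is an assumption; substitute the real checkpoint path. Loading the tokenizer with use_fast=False follows the recommendation commonly given for OpenLLaMA-family models to avoid whitespace-merging issues in the fast tokenizer.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id copied from the page title; verify it matches the
# actual checkpoint location before use (assumption, not confirmed).
model_id = "m-a-p/OpenLLaMA-Reproduce-1610.61B"

# use_fast=False is the commonly recommended setting for
# OpenLLaMA-style tokenizers.
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Simple question-answering-style prompt, matching the use cases above.
prompt = "Q: What is the capital of France?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```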