m-a-p/OpenLLaMA-Reproduce-2030.04B
OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, designed to generate high-quality, contextually relevant text. It was trained on a diverse composite dataset including web-crawled data, scholarly articles, and literature, giving it broad domain coverage. Training used a Llama2-style learning rate schedule and a large batch size. Its primary strength is generating text across a wide array of topics, thanks to its comprehensive training data.
OpenLLaMA 7Bv2 Overview
OpenLLaMA 7Bv2 is a 7 billion parameter language model focused on generating high-quality, contextually relevant text. It distinguishes itself through its comprehensive training on a diverse composite dataset, which includes web-crawled data, scholarly articles, and extensive literature, ensuring broad applicability across various domains.
Key Capabilities
- Broad Domain Understanding: Trained on a composite dataset encompassing the Falcon RefinedWeb dataset, StarCoder data, Wikipedia, arXiv papers, books, and Stack Exchange, enabling it to handle a wide range of topics.
- Contextually Relevant Text Generation: Designed to produce outputs that are highly relevant to the given context.
Training Details
- Optimized Learning: Uses a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, with a global batch size of 4 million tokens.
- Llama2-like Scheduling: Follows a learning rate schedule similar to Llama2's for stable, efficient convergence.
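A Llama2-style schedule of this kind is typically implemented as linear warmup followed by cosine decay down to the minimum rate. A minimal sketch, using the 3e-4/3e-5 values stated above but with a hypothetical warmup length (the card does not specify one):

```python
import math

def llama2_style_lr(step, max_steps, warmup_steps=2000,
                    max_lr=3e-4, min_lr=3e-5):
    """Linear warmup to max_lr, then cosine decay to min_lr.

    max_lr and min_lr match the card (3e-4 / 3e-5); warmup_steps is a
    placeholder value, not documented by the model authors.
    """
    if step < warmup_steps:
        # Linear ramp from 0 to max_lr over the warmup period.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr at the end of warmup to min_lr at max_steps.
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

Note that with a 4-million-token global batch, the number of optimizer steps is simply the total token budget divided by 4e6, which fixes `max_steps` once the training corpus size is known.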
Good For
- Applications requiring general-purpose text generation with strong contextual understanding.
- Tasks benefiting from a model trained on a wide variety of data sources, from web content to academic papers.