m-a-p/OpenLLaMA-Reproduce-2030.04B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA 7Bv2 is a 7-billion-parameter language model developed by m-a-p, designed for high-quality, contextually relevant text generation. It was trained on a diverse composite dataset including web-crawled data, scholarly articles, and literature, giving it broad domain coverage. The model uses a Llama2-like learning rate schedule and a large batch size for stable, efficient training. Its primary strength is generating text across a wide array of topics, owing to its comprehensive training data.

OpenLLaMA 7Bv2 Overview

OpenLLaMA 7Bv2 is a 7-billion-parameter language model focused on generating high-quality, contextually relevant text. It distinguishes itself through comprehensive training on a diverse composite dataset spanning web-crawled data, scholarly articles, and extensive literature, which gives it broad applicability across domains.

Key Capabilities

  • Broad Domain Understanding: Trained on a composite dataset encompassing the Falcon RefinedWeb dataset, the StarCoder datasets, Wikipedia, arXiv papers, books, and Stack Exchange data, enabling it to handle a wide range of topics (see the dataset-mixture sketch after this list).
  • Contextually Relevant Text Generation: Designed to produce outputs that are highly relevant to the given context.
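To make the composite-dataset claim concrete, below is a minimal sketch of interleaving several public corpora into one streaming pretraining mixture with the Hugging Face datasets library. The sampling probabilities and the choice of dataset revisions are illustrative assumptions; this card does not publish exact mixture ratios.

```python
# A minimal sketch of assembling a composite pretraining mixture with the
# Hugging Face `datasets` library. Probabilities are assumed for illustration.
from datasets import load_dataset, interleave_datasets

refinedweb = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)
starcoder = load_dataset("bigcode/starcoderdata", split="train", streaming=True)
wikipedia = load_dataset("wikimedia/wikipedia", "20231101.en", split="train", streaming=True)

# Normalize every source to a single "text" column so they can be interleaved.
refinedweb = refinedweb.select_columns(["content"]).rename_column("content", "text")
starcoder = starcoder.select_columns(["content"]).rename_column("content", "text")
wikipedia = wikipedia.select_columns(["text"])

# Sample from each source with the assumed probabilities (they must sum to 1.0);
# streaming avoids materializing terabyte-scale corpora on disk.
mixture = interleave_datasets(
    [refinedweb, starcoder, wikipedia],
    probabilities=[0.80, 0.15, 0.05],
    seed=42,
)

for example in mixture.take(3):
    print(example["text"][:80])
```

Streaming interleaving of this kind is one common way to hold a fixed sampling ratio over heterogeneous corpora without pre-shuffling them on disk.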

Training Details

  • Optimized Learning: Uses a maximum learning rate of 3e-4 and a minimum of 3e-5, with a batch size of 4 million tokens (at the 4k context length, roughly 1,000 full-length sequences per optimizer step).
  • Llama2-like Scheduling: Employs a learning rate scheduling strategy similar to Llama2 for stable and efficient convergence, as sketched below.
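As a concrete illustration of the numbers above, the sketch below implements a Llama2-style schedule: linear warmup followed by cosine decay from the stated maximum of 3e-4 down to the minimum of 3e-5. The warmup length and total step count are assumptions; the card does not state them.

```python
# A minimal sketch of a Llama2-style learning rate schedule using the card's
# figures: linear warmup, then cosine decay from max_lr=3e-4 to min_lr=3e-5.
import math

MAX_LR, MIN_LR = 3e-4, 3e-5
WARMUP_STEPS, TOTAL_STEPS = 2_000, 250_000  # assumed values, not from the card

def lr_at(step: int) -> float:
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the maximum learning rate.
        return MAX_LR * step / WARMUP_STEPS
    # Cosine decay from MAX_LR down to MIN_LR over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return MIN_LR + 0.5 * (MAX_LR - MIN_LR) * (1 + math.cos(math.pi * progress))

for s in (0, 1_000, 2_000, 125_000, 250_000):
    print(f"step {s:>7}: lr = {lr_at(s):.2e}")
```

Printing a few checkpoints confirms the shape: the rate climbs linearly to 3e-4, then decays smoothly and bottoms out at 3e-5.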

Good For

  • Applications requiring general-purpose text generation with strong contextual understanding (a minimal inference sketch follows this list).
  • Tasks benefiting from a model trained on a wide variety of data sources, from web content to academic papers.
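For reference, here is a minimal inference sketch using Hugging Face transformers, assuming the weights are hosted under the repo ID shown at the top of this card; the dtype and sampling settings are illustrative, not recommendations from the model authors.

```python
# A minimal text-generation sketch; the repo ID is taken from this card, and
# the generation settings are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "m-a-p/OpenLLaMA-Reproduce-2030.04B"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.float16, device_map="auto"
)

prompt = "The key ideas behind transformer language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.9
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that the card lists FP8 quantization for serving; the float16 load above is simply a portable baseline for local experimentation.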