m-a-p/OpenLLaMA-Reproduce-335.54B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, designed for high-quality, contextually relevant text predictions. It is trained on a diverse composite dataset including web-crawled data, scholarly articles, and literature. This model focuses on broad domain coverage and applicability, making it suitable for general-purpose text generation tasks.


OpenLLaMA 7Bv2 Overview

OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, engineered to provide high-quality and contextually relevant text predictions. It leverages a diverse training dataset to ensure broad domain coverage and applicability across various tasks.

Key Characteristics

  • Diverse Training Data: The model was trained on a composite dataset that includes the Falcon RefinedWeb dataset, the StarCoder dataset, and Wikipedia, arXiv academic papers, a large collection of books, and Stack Exchange data drawn from RedPajama. This comprehensive data mix aims to provide a wide range of knowledge and understanding.
  • Optimized Training Procedure: Training used a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, with a batch size of 4 million tokens. The learning rate schedule closely follows the one used for Llama 2, contributing to efficient and stable convergence; a sketch of such a schedule follows this list.
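
As a rough illustration of the schedule described above, the sketch below implements a Llama 2-style linear warmup followed by cosine decay from the 3e-4 peak to the 3e-5 floor (10% of the peak, as in Llama 2). The warmup length and total step count are placeholder assumptions, not published training details.

```python
import math

def llama2_style_lr(step, total_steps, warmup_steps=2000,
                    max_lr=3e-4, min_lr=3e-5):
    """Linear warmup followed by cosine decay from max_lr to min_lr.

    Illustrative only: max_lr and min_lr match the values quoted above;
    warmup_steps and total_steps are assumed, not published.
    """
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return max_lr * step / warmup_steps
    # Cosine decay from the peak down to the floor.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(progress, 1.0)))
    return min_lr + (max_lr - min_lr) * cosine
```

In practice a function like this would be wrapped in the optimizer's learning rate scheduler rather than called directly.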

Potential Use Cases

  • General Text Generation: Suitable for producing coherent, contextually appropriate text across a wide range of topics thanks to its broad training data; a minimal loading and generation sketch follows this list.
  • Question Answering: Its inclusion of Stack Exchange and Wikipedia data suggests potential for factual question-answering tasks.
  • Content Creation: Can be applied to tasks requiring diverse knowledge, such as drafting articles, summaries, or creative content.
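
The following is a minimal generation sketch using Hugging Face Transformers. The repository id mirrors the model name on this page and is an assumption; substitute the actual checkpoint path if it differs.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint id, taken from the page title; adjust if the hosted repo differs.
model_id = "m-a-p/OpenLLaMA-Reproduce-335.54B"

# Original OpenLLaMA releases recommend use_fast=False due to tokenizer
# whitespace handling; this may apply to this reproduction as well.
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Q: What is the capital of France?\nA:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```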