m-a-p/OpenLLaMA-Reproduce-503.32B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA-Reproduce-503.32B is a 7 billion parameter language model developed by m-a-p, trained to produce high-quality, contextually relevant text. It was trained on a diverse composite dataset, including web-crawled data, scholarly articles, and question-answer pairs, for broad domain coverage. The training procedure uses a Llama2-like learning rate schedule and a batch size of 4 million tokens, with an emphasis on efficient, performant language generation.

OpenLLaMA 7Bv2 Model Overview

OpenLLaMA 7Bv2, developed by m-a-p, is a 7 billion parameter language model designed for generating high-quality, contextually relevant text. It is distinguished by its training on a diverse composite dataset, ensuring broad domain applicability and robust understanding across various topics.

Key Training Details

  • Diverse Dataset: Trained on a comprehensive mixture including Falcon RefinedWeb, the StarCoder datasets, Wikipedia, arXiv academic papers, a wide collection of books, and RedPajama's Stack Exchange data. This blend is intended to provide encyclopedic knowledge, scientific understanding, and general literature comprehension.
  • Optimized Training Procedure: The model was trained with a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, under a schedule that closely mirrors Llama2's, which contributes to stable convergence. It employed a substantial batch size of 4 million tokens to improve training efficiency; at the 4k context length, that corresponds to roughly a thousand sequences per optimizer step. A minimal sketch of such a schedule follows this list.
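
For concreteness, here is a minimal sketch of a Llama2-style schedule (linear warmup, then cosine decay) using the stated peak of 3e-4 and floor of 3e-5. The warmup and total step counts are illustrative placeholders; the card does not report them.

```python
import math

def llama2_style_lr(step: int, *, warmup_steps: int = 2000,
                    total_steps: int = 100_000,
                    max_lr: float = 3e-4, min_lr: float = 3e-5) -> float:
    """Linear warmup to max_lr, then cosine decay down to min_lr.

    warmup_steps and total_steps are hypothetical; only the peak and
    minimum learning rates come from the model card.
    """
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return max_lr * step / warmup_steps
    # Cosine decay over the remaining steps, clamped at the floor.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(progress, 1.0)))
    return min_lr + (max_lr - min_lr) * cosine

# Example: learning rate at a few points in training.
for s in (0, 1000, 2000, 50_000, 100_000):
    print(s, f"{llama2_style_lr(s):.2e}")
```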

Potential Use Cases

  • General Text Generation: Capable of producing coherent and contextually appropriate text for a wide range of applications; a hedged loading-and-generation sketch follows this list.
  • Knowledge-based Q&A: Its training on encyclopedic and scientific data makes it suitable for answering questions requiring broad factual knowledge.
  • Content Creation: Can assist in generating diverse content, leveraging its exposure to various literary genres and web data.
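
Assuming the checkpoint is published on the Hugging Face Hub under the id shown in this page's title (which the page itself does not confirm), a typical loading-and-generation flow with the transformers library would look like the following sketch.

```python
# Minimal sketch using Hugging Face transformers. The model id is taken
# from this page's title; whether a checkpoint is actually hosted under
# that id on the Hub is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/OpenLLaMA-Reproduce-503.32B"  # assumed Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Briefly explain what a learning rate schedule does:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Plain sampling, staying well within the model's 4k context window.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True,
                         temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```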