m-a-p/OpenLLaMA-Reproduce-117.44B

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

m-a-p/OpenLLaMA-Reproduce-117.44B is a 7 billion parameter OpenLLaMA reproduction, trained by m-a-p, designed to produce high-quality, contextually relevant text. It is trained on a diverse composite dataset, including web-crawled data, scholarly articles, and literature, to ensure broad domain coverage, and is optimized for general-purpose language understanding and generation across a wide range of topics.


OpenLLaMA 7Bv2 Model Overview

OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, focusing on generating high-quality and contextually relevant text. Its training regimen emphasizes broad domain coverage and applicability, making it suitable for a wide array of natural language processing tasks.
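
For reference, here is a minimal inference sketch using the Hugging Face transformers library, assuming the checkpoint is published on the Hub under the id shown on this page. The prompt, dtype, and generation settings are illustrative choices, not recommendations from the model card; FP8 serving depends on the runtime, so FP16 is used as a safe default here.

```python
# Minimal inference sketch; the model id is taken from this page and
# assumed to resolve on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/OpenLLaMA-Reproduce-117.44B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # illustrative; FP8 is runtime-specific
    device_map="auto",
)

prompt = "Summarize the main idea of the transformer architecture:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```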

Key Capabilities

  • Diverse Knowledge Base: Trained on a composite dataset that includes Falcon RefinedWeb, the StarCoder datasets, Wikipedia, arXiv papers, a large collection of books, and Stack Exchange data curated by RedPajama. This mix gives the model broad coverage of encyclopedic, scientific, code, and general literary domains.
  • Contextual Understanding: Designed to provide contextually relevant text predictions, indicating strong comprehension of input prompts.
  • Optimized Training: Uses a maximum learning rate of 3e-4, a minimum learning rate of 3e-5, and a batch size of 4 million tokens. The learning rate schedule closely follows the one used for Llama 2 to aid convergence (a minimal sketch of such a schedule follows this list).
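
To make the training setup concrete, here is a minimal sketch of a Llama 2-style schedule with the rates quoted above: linear warmup followed by cosine decay from the 3e-4 peak to the 3e-5 floor (10% of the peak, matching Llama 2's convention). The warmup length and total step count are illustrative assumptions; the model card does not state them.

```python
import math

MAX_LR = 3e-4          # peak learning rate, from the model card
MIN_LR = 3e-5          # floor, from the model card (10% of peak)
WARMUP_STEPS = 2000    # assumed; not stated on this page
TOTAL_STEPS = 250_000  # assumed; not stated on this page

def lr_at(step: int) -> float:
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak learning rate.
        return MAX_LR * step / WARMUP_STEPS
    # Cosine decay from MAX_LR down to MIN_LR over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return MIN_LR + 0.5 * (MAX_LR - MIN_LR) * (1 + math.cos(math.pi * progress))
```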

Good For

  • General Text Generation: Capable of producing coherent and relevant text for various applications.
  • Question Answering: Benefits from its training on Stack Exchange data and encyclopedic knowledge for factual queries.
  • Content Creation: Its broad dataset exposure makes it versatile for generating diverse content, from creative writing to informative summaries.