m-a-p/OpenLLaMA-Reproduce-1828.72B
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer
m-a-p/OpenLLaMA-Reproduce-1828.72B is a 7-billion-parameter OpenLLaMA model developed by m-a-p for high-quality, contextually relevant text generation. It is trained on a diverse composite dataset that includes web-crawled data, scholarly articles, and question-answer pairs, giving it broad domain coverage. The model is well suited to applications that need general-purpose language understanding and generation across a wide range of topics.
OpenLLaMA 7B v2 Overview
m-a-p/OpenLLaMA-Reproduce-1828.72B is a 7 billion parameter language model, part of the OpenLLaMA family, focused on generating high-quality, contextually relevant text. Its training leverages a diverse composite dataset, ensuring broad applicability across various domains.
Key Capabilities & Training
- Broad Domain Coverage: Trained on a rich dataset combining web-crawled text (Falcon RefinedWeb), code (the StarCoder datasets), encyclopedic knowledge (Wikipedia), scientific writing (arXiv), and extensive literature and Q&A pairs (books and Stack Exchange data curated by RedPajama).
- Optimized Training Procedure: Uses a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, with a large batch size of 4 million tokens. The learning rate schedule follows the strategy used in Llama 2 for efficient convergence (a rough schedule sketch follows this list).
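The numbers above imply a decay from 3e-4 down to 3e-5 over the course of training. As a rough illustration only (the warmup and total step counts below are assumptions, not values from this card), a Llama-style cosine schedule with linear warmup can be written as:

```python
import math

def lr_at_step(step: int,
               warmup_steps: int = 2_000,       # assumed, not from this card
               total_steps: int = 250_000,      # assumed, not from this card
               max_lr: float = 3e-4,            # maximum LR stated above
               min_lr: float = 3e-5) -> float:  # minimum LR stated above
    """Cosine decay with linear warmup, Llama-style (illustrative sketch)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to max_lr.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = min((step - warmup_steps) / (total_steps - warmup_steps), 1.0)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```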
Good For
- General-purpose text generation: Its diverse training data makes it suitable for a wide array of text prediction tasks.
- Contextual understanding: Designed to provide contextually relevant outputs across various topics.
- Applications requiring broad knowledge: Benefits from its training on encyclopedic, scientific, and literary sources.
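For reference, below is a minimal inference sketch using Hugging Face transformers. It assumes the checkpoint is published on the Hub under this page's model id and loads with the standard LLaMA architecture; the dtype, sampling parameters, and prompt are illustrative choices, not recommendations from the model authors.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model id taken from this page; availability on the Hub is assumed.
model_id = "m-a-p/OpenLLaMA-Reproduce-1828.72B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 7B model in fp16 fits on a single 24 GB GPU
    device_map="auto",
)

prompt = "The main differences between arXiv preprints and journal articles are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt + generated tokens within the 4k context length listed above.
output = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```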