m-a-p/OpenLLaMA-Reproduce-1023.41B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA-Reproduce-1023.41B is a 7 billion parameter language model from m-a-p, designed to generate high-quality, contextually relevant text. It is trained on a diverse composite dataset that includes web-crawled data, scholarly articles, and question-answer pairs. The model is notable for its broad domain coverage, and its training procedure closely follows the Llama2 learning rate scheduling strategy.


OpenLLaMA 7Bv2 Model Overview

OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, focused on generating high-quality, contextually relevant text. It distinguishes itself through its comprehensive training data and optimized training methodology.

Key Capabilities & Training Details

  • Diverse Training Data: The model was trained on a rich composite dataset, ensuring broad domain understanding. This includes:
    • Falcon RefinedWeb dataset
    • StarCoder datasets
    • Wikipedia for encyclopedic knowledge
    • arXiv for scientific understanding
    • A vast collection of books
    • Stack Exchange data curated by RedPajama
  • Optimized Training Procedure: The training utilized a maximum learning rate of 3e-4 and a minimum of 3e-5, with a batch size of 4 million tokens. The learning rate scheduler closely mirrors the strategy employed in Llama2, contributing to stable and efficient convergence.
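
The Llama2 recipe the card refers to is a linear warmup followed by cosine decay to roughly 10% of the peak learning rate, which matches the 3e-4 / 3e-5 values quoted above. The following is a minimal sketch of such a schedule; the warmup and total step counts are illustrative placeholders, not values published for this model.

```python
import math

# Values quoted in the card above.
MAX_LR = 3e-4
MIN_LR = 3e-5


def lr_at_step(step: int, warmup_steps: int = 2_000, total_steps: int = 250_000) -> float:
    """Llama2-style schedule: linear warmup, then cosine decay to MIN_LR.

    warmup_steps and total_steps are hypothetical, chosen only to
    illustrate the shape of the schedule.
    """
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return MAX_LR * step / warmup_steps
    # Cosine decay from MAX_LR down to MIN_LR over the remaining steps.
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine


if __name__ == "__main__":
    for s in (0, 1_000, 2_000, 125_000, 250_000):
        print(f"step {s:>7}: lr = {lr_at_step(s):.2e}")
```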

Good For

  • Applications requiring broad domain knowledge and contextually relevant text generation.
  • Tasks benefiting from a model trained on a diverse mix of web data, academic papers, and structured Q&A.
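
For applications like these, the model can be used for plain text generation through the Hugging Face `transformers` library. The sketch below assumes the weights are published on the Hub under the repository name shown in the page title (not confirmed here); the prompt and generation settings are illustrative only.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id taken from the page title; verify the exact Hub name before use.
MODEL_ID = "m-a-p/OpenLLaMA-Reproduce-1023.41B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto", device_map="auto")

# Simple Q&A-style prompt, matching the structured Q&A use case above.
prompt = "Q: What is the capital of France?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding for a short, deterministic completion.
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```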