m-a-p/OpenLLaMA-Reproduce-754.97B
OpenLLaMA-Reproduce-754.97B is a 7 billion parameter language model developed by m-a-p, designed to produce high-quality, contextually relevant text. It was trained on a diverse composite dataset spanning web-crawled data, code, scholarly articles, and books, and is optimized for broad domain coverage and general applicability, following a training strategy similar to Llama 2 for robust performance.
OpenLLaMA 7Bv2 Overview
OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, focused on generating high-quality and contextually relevant text. Its training regimen emphasizes broad domain coverage, utilizing a diverse composite dataset to ensure versatility across various applications.
Key Training Details
- Dataset Composition: The model was trained on a rich composite dataset including:
  - The Falcon RefinedWeb dataset
  - StarCoder datasets
  - Wikipedia for encyclopedic knowledge
  - arXiv for scientific understanding
  - A wide collection of books
  - Stack Exchange data curated by RedPajama
- Learning Rate Strategy: Employed a maximum learning rate of 3e-4 and a minimum of 3e-5, with a scheduler closely following the Llama 2 recipe for stable convergence (a sketch of such a schedule follows this list).
- Batch Size: Utilized a batch size of 4 million tokens to balance training efficiency and performance.
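The card states only the peak and floor learning rates; the Llama 2 recipe it references pairs these with a linear warmup followed by cosine decay down to the floor. The sketch below illustrates one plausible schedule under that assumption; `warmup_steps` and `total_steps` are illustrative placeholders, not values reported for this model.

```python
import math

def llama2_style_lr(step: int,
                    max_lr: float = 3e-4,       # peak LR from the card
                    min_lr: float = 3e-5,       # floor LR from the card
                    warmup_steps: int = 2_000,  # assumed, not reported on the card
                    total_steps: int = 100_000  # assumed, not reported on the card
                    ) -> float:
    """Linear warmup to max_lr, then cosine decay to min_lr (Llama 2-style)."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to the peak learning rate
        return max_lr * step / max(1, warmup_steps)
    # Cosine decay over the remaining steps, bottoming out at min_lr
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(progress, 1.0)))
    return min_lr + (max_lr - min_lr) * cosine
```

With these placeholder step counts the rate reaches 3e-4 at the end of warmup and decays smoothly to the 3e-5 minimum quoted above by the final step.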
Intended Use Cases
This model is suitable for tasks requiring:
- General text generation and completion
- Applications benefiting from broad domain knowledge
- Contextually aware language understanding
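For example, general text generation can be run with the Hugging Face transformers library, assuming the checkpoint follows the standard LLaMA architecture and is hosted under the repository name shown in the title. The snippet below is a minimal sketch; the prompt and generation settings are illustrative, not tuned recommendations.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository ID taken from the card title; adjust if the hosted name differs.
model_id = "m-a-p/OpenLLaMA-Reproduce-754.97B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the accelerate package; drop it for CPU-only loading.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Open-source language models are useful because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a short continuation; settings are illustrative, not tuned.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```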