m-a-p/OpenLLaMA-Reproduce-1291.85B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 1, 2024 · Architecture: Transformer

OpenLLaMA-Reproduce-1291.85B is a 7 billion parameter language model developed by m-a-p, designed to generate high-quality, contextually relevant text. Trained on a diverse composite dataset spanning web-crawled data, scholarly articles, books, and question-answer pairs, it offers broad domain coverage and is particularly suited to general-purpose text generation and understanding tasks across a wide range of subjects.


OpenLLaMA 7Bv2 Overview

OpenLLaMA 7Bv2 is a 7 billion parameter language model focused on generating high-quality, contextually relevant text. It was trained by m-a-p using a diverse composite dataset to ensure broad domain applicability and understanding across various topics.
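
For quick experimentation, the sketch below shows one way to run the model for text generation with the Hugging Face transformers library. The repo ID is taken from this card's title and is assumed to resolve on the Hub; the prompt and sampling settings are purely illustrative.

```python
# Minimal text-generation sketch with Hugging Face transformers.
# The repo ID below is assumed from this card's title; adjust it to the
# actual checkpoint path if it differs.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/OpenLLaMA-Reproduce-1291.85B"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "The James Webb Space Telescope has revealed"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```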

Key Training Details

The model's training incorporated a rich and varied dataset, drawing on the sources below (a data-mixing sketch follows the list):

  • Web-crawled data: the Falcon RefinedWeb and StarCoder datasets.
  • Encyclopedic knowledge: Contributions from Wikipedia.
  • Scientific understanding: Academic papers sourced from arXiv.
  • Extensive literature: A vast collection of books across multiple genres.
  • Curated Q&A: Stack Exchange data, specifically curated by RedPajama.
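
As a rough illustration of how such a composite mixture can be assembled, the sketch below interleaves two of the listed sources with the Hugging Face datasets library. The repo IDs, the subset choice, and the sampling probabilities are illustrative assumptions, not the actual recipe used for this model.

```python
# Sketch of building a composite pre-training mixture with Hugging Face
# `datasets`. Repo IDs, subset, and probabilities are placeholders; the
# true mixture used for OpenLLaMA 7Bv2 is not specified on this card.
from datasets import load_dataset, interleave_datasets

# Stream two source corpora and normalize both to a single "text" column.
web = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)
web = web.rename_column("content", "text").select_columns(["text"])

code = load_dataset("bigcode/starcoderdata", data_dir="python", split="train", streaming=True)
code = code.rename_column("content", "text").select_columns(["text"])

# Sample from the sources with placeholder weights.
mixture = interleave_datasets([web, code], probabilities=[0.8, 0.2], seed=42)

for example in mixture.take(3):
    print(example["text"][:100])
```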

The training procedure used a maximum learning rate of 3e-4, a minimum learning rate of 3e-5, and a batch size of 4 million tokens. The learning rate scheduler closely followed the strategy used in Llama 2: a warmup phase followed by a gradual cosine decay toward the minimum learning rate for stable convergence.
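
The sketch below reproduces that schedule shape: linear warmup followed by cosine decay from the 3e-4 peak to the 3e-5 floor quoted above. The warmup length and total step count are illustrative assumptions; for example, if the run covered roughly 1T tokens (an assumption here), a 4M-token batch would correspond to about 250,000 optimizer steps.

```python
# Sketch of a Llama-2-style schedule: linear warmup, then cosine decay from
# max_lr (3e-4) down to min_lr (3e-5). Warmup length and total step count
# are assumed for illustration only.
import math

def lr_at_step(step, total_steps, warmup_steps=2000, max_lr=3e-4, min_lr=3e-5):
    if step < warmup_steps:
        # Linear warmup from 0 to max_lr.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# ~1T tokens at a 4M-token batch size -> roughly 250k steps (assumed).
total_steps = 250_000
for s in (0, 2_000, 125_000, 250_000):
    print(s, f"{lr_at_step(s, total_steps):.2e}")
```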

Use Cases

This model is well-suited for applications requiring:

  • General text generation and completion.
  • Contextual understanding and response generation.
  • Tasks benefiting from broad knowledge across web content, academic papers, and literature.