m-a-p/OpenLLaMA-Reproduce-2041.21B
OpenLLaMA 7Bv2 is a 7 billion parameter language model developed by m-a-p, designed to produce high-quality, contextually relevant text. It was trained on a diverse composite dataset that includes web-crawled data, scholarly articles, and literature, giving it broad domain coverage. The model targets general-purpose text generation and understanding across a wide range of topics, and its training procedure closely follows that of Llama2.
OpenLLaMA 7Bv2 Overview
OpenLLaMA 7Bv2 is a 7 billion parameter language model focused on generating high-quality, contextually relevant text. It was trained on a comprehensive and diverse composite dataset to ensure broad domain coverage and applicability.
Key Capabilities & Training Details
- Diverse Training Data: The model was trained on a rich dataset comprising:
  - The Falcon RefinedWeb dataset
  - The StarCoder datasets
  - Wikipedia for encyclopedic knowledge
  - arXiv for scientific understanding
  - A vast collection of books across multiple genres
  - Stack Exchange data curated by RedPajama
- Optimized Training Procedure: Training used a maximum learning rate of 3e-4 decaying to a minimum of 3e-5, with a batch size of 4 million tokens. The learning rate schedule closely mirrors the one used for Llama2, supporting stable convergence; a sketch of such a schedule is shown after this list.
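The card states only the maximum and minimum learning rates and that the schedule follows Llama2, which used a cosine decay with linear warmup. The sketch below illustrates one such schedule under those assumptions; the warmup length and total step count are illustrative placeholders, not values from the card.

```python
import math

MAX_LR = 3e-4         # maximum learning rate from the model card
MIN_LR = 3e-5         # minimum learning rate from the model card
WARMUP_STEPS = 2_000  # assumed warmup length; not stated in the card
TOTAL_STEPS = 250_000 # assumed total optimizer steps; not stated in the card

def learning_rate(step: int) -> float:
    """Llama2-style schedule: linear warmup, then cosine decay MAX_LR -> MIN_LR."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to the peak learning rate.
        return MAX_LR * (step + 1) / WARMUP_STEPS
    if step >= TOTAL_STEPS:
        return MIN_LR
    # Cosine decay over the remaining steps, bottoming out at MIN_LR.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine

if __name__ == "__main__":
    for s in (0, 1_000, 2_000, 50_000, 125_000, 250_000):
        print(f"step {s:>7}: lr = {learning_rate(s):.2e}")
```

Note that the stated minimum of 3e-5 is 10% of the peak 3e-4, which is consistent with the decay-to-10%-of-peak convention used for Llama2.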
Use Cases
Thanks to its extensive and varied training data, the model is well suited to general text generation, question answering, and other applications that require broad domain understanding across diverse linguistic contexts. A minimal inference sketch is shown below.
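The following sketch shows one way to load the model for text generation with Hugging Face transformers, assuming the checkpoint is published in the standard Llama/transformers format. The repo id is taken from the page title and the generation settings are illustrative choices, not values prescribed by this card.

```python
# Minimal text-generation sketch with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/OpenLLaMA-Reproduce-2041.21B"  # assumed hub repo id (from the page title)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on a single GPU
    device_map="auto",
)

prompt = "Q: What is the capital of France?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since this is a base model rather than an instruction-tuned one, prompts generally work best as completions (as in the question-answer prefix above) rather than as chat-style instructions.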