stabilityai/StableBeluga2
Stable Beluga 2 is a roughly 69 billion parameter causal language model from Stability AI, built on Llama 2 70B and fine-tuned on an Orca-style dataset. The model is designed for instruction following, leveraging its scale and specialized fine-tuning to generate helpful and safe responses. It inherits Llama 2's 4096-token context window and is suited to conversational AI and general text generation tasks.
Stable Beluga 2: An Instruction-Following Llama 2 Model
Stable Beluga 2 is a 69 billion parameter language model developed by Stability AI, built on the Llama 2 70B architecture. It was fine-tuned on an Orca-style dataset, which emphasizes learning from detailed explanation traces, to strengthen its instruction-following capabilities.
Key Capabilities
- Advanced Instruction Following: Optimized to understand and execute complex instructions, making it suitable for conversational agents and task-oriented applications.
- Context Window: Supports the 4096-token context length inherited from Llama 2, allowing it to maintain context across multi-turn exchanges and moderately long prompts.
- Safety and Ethical Considerations: Designed with safety guidelines in mind, as reflected in its system prompt, which instructs the model to be safe and avoid illegal activities (the usage sketch after this list shows where that system prompt fits into the prompt layout).
- Llama 2 Foundation: Benefits from the robust architecture and pre-training of the Llama 2 70B model.
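The snippet below is a minimal usage sketch with the Hugging Face transformers library, not an official example: the stabilityai/StableBeluga2 repository id, the System/User/Assistant prompt layout, and the system prompt wording follow the upstream model card, while the float16 loading settings and generation parameters are illustrative and should be adapted to your hardware.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model; device_map="auto" spreads the ~70B parameters
# across available GPUs (and CPU, if needed). Adjust dtype/devices to your setup.
tokenizer = AutoTokenizer.from_pretrained("stabilityai/StableBeluga2", use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/StableBeluga2",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    device_map="auto",
)

# Prompt layout: "### System:", "### User:", "### Assistant:" sections, with the
# safety-oriented system prompt described above.
system_prompt = (
    "### System:\nYou are Stable Beluga, an AI that follows instructions extremely well. "
    "Help as much as you can. Remember, be safe, and don't do anything illegal.\n\n"
)
message = "Explain the difference between pre-training and supervised fine-tuning."
prompt = f"{system_prompt}### User: {message}\n\n### Assistant:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, do_sample=True, top_p=0.95, top_k=0, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The reply is generated as a continuation of the "### Assistant:" marker, so the decoded output contains the prompt followed by the model's answer.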
Training Details
The model was trained with supervised fine-tuning on an internal Orca-style dataset, in mixed precision (BF16), and optimized with AdamW. Different portions of the dataset used their own hyperparameters, including different batch sizes and learning rates with cosine decay.
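As a rough illustration of that optimizer setup (not the actual training code), the sketch below pairs AdamW with a cosine-decay learning rate schedule in PyTorch. Every value here is a placeholder: the real per-stage batch sizes, learning rates, and step counts are not listed in this summary.

```python
import torch
from torch.optim import AdamW
from transformers import get_cosine_schedule_with_warmup

# Stand-in module; in the real run this would be the Llama 2 70B model being fine-tuned.
model = torch.nn.Linear(4096, 4096)
inputs = torch.randn(8, 4096)

# Placeholder hyperparameters for illustration only.
optimizer = AdamW(model.parameters(), lr=3e-5, weight_decay=0.1)
num_training_steps = 100
# num_warmup_steps=0 gives a plain cosine decay from the initial learning rate.
scheduler = get_cosine_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=num_training_steps
)

for step in range(num_training_steps):
    # In the real run this would be the LM forward/backward pass, typically under
    # torch.autocast(device_type="cuda", dtype=torch.bfloat16) for BF16 mixed precision.
    loss = model(inputs).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```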
Good For
- Conversational AI: Its strong instruction-following makes it well-suited for chatbots and interactive assistants.
- General Text Generation: Capable of generating diverse and coherent text based on user prompts.
- Research and Development: Provides a powerful base for further fine-tuning and experimentation in large language models, particularly for those interested in Orca-style training methodologies.
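On that last point, one common way to make a 70B checkpoint tractable for experimentation is parameter-efficient fine-tuning with LoRA adapters. The sketch below uses the peft library; it is not part of the Stable Beluga 2 release, and the adapter settings (rank, target modules) are illustrative assumptions rather than recommended values.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base model (bfloat16, sharded across available devices).
base_model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/StableBeluga2",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach small trainable LoRA adapters to the attention projections; the frozen
# base weights stay untouched, so only a tiny fraction of parameters is trained.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```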