stabilityai/StableBeluga2

Text Generation · Concurrency Cost: 4 · Model Size: 69B · Quant: FP8 · Ctx Length: 32k · Published: Jul 20, 2023 · Architecture: Transformer · Cold: 0.9K

Stable Beluga 2 is a 69 billion parameter causal language model from Stability AI, based on Llama 2 and fine-tuned on an Orca-style dataset. The fine-tuning targets instruction following, combining the base model's scale with explanation-trace training to produce helpful, safe responses. With a context length of 32768 tokens, it is well suited to complex conversational AI and general text generation tasks.


Stable Beluga 2: An Instruction-Following Llama 2 Model

Stable Beluga 2 is a 69 billion parameter language model developed by Stability AI, built upon the Llama 2 70B architecture. It has been fine-tuned using an Orca-style dataset, which emphasizes learning from detailed explanation traces, aiming to enhance its instruction-following capabilities.

Key Capabilities

  • Advanced Instruction Following: Optimized to understand and execute complex instructions, making it suitable for conversational agents and task-oriented applications.
  • Large Context Window: Supports a context length of 32768 tokens, allowing it to process and generate longer, more coherent texts while maintaining context.
  • Safety and Ethical Considerations: Designed with safety guidelines in mind, as indicated by its system prompt, which instructs it to be safe and avoid illegal activities.
  • Llama 2 Foundation: Benefits from the robust architecture and pre-training of the Llama 2 70B model.
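The system-prompt behavior above relies on the model's expected prompt layout: a `### System:` block, followed by `### User:` and a trailing `### Assistant:` header that the model completes. A minimal sketch of a prompt builder (the helper name and the default system message are illustrative, not part of the model card):

```python
def build_beluga_prompt(
    user_message: str,
    system_message: str = (
        "You are Stable Beluga, an AI that follows instructions well. "
        "Be safe and avoid anything illegal."
    ),
) -> str:
    """Assemble a prompt in the ### System / ### User / ### Assistant layout.

    The model is expected to generate its reply after the final
    '### Assistant:' header.
    """
    return (
        f"### System:\n{system_message}\n\n"
        f"### User:\n{user_message}\n\n"
        "### Assistant:\n"
    )

prompt = build_beluga_prompt("Write a haiku about the ocean.")
print(prompt)
```

The resulting string can be passed directly to a tokenizer; generation should stop at the end-of-sequence token or a new `### User:` header.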

Training Details

The model was trained with supervised fine-tuning on an internal Orca-style dataset, in mixed precision (BF16) with the AdamW optimizer. Different partitions of the dataset used their own hyperparameters, including distinct batch sizes and peak learning rates, each annealed with cosine decay.
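The cosine-decay schedule mentioned above can be written as a small function. This is a generic sketch of cosine learning-rate decay, not the exact schedule used for Stable Beluga 2; the `peak_lr` and `min_lr` defaults are placeholder assumptions:

```python
import math

def cosine_decay_lr(step: int, total_steps: int,
                    peak_lr: float = 3e-5, min_lr: float = 0.0) -> float:
    """Learning rate decayed from peak_lr to min_lr along a half cosine.

    At step 0 the rate equals peak_lr; at total_steps (and beyond) it
    equals min_lr; the midpoint sits exactly halfway between the two.
    """
    progress = min(step / total_steps, 1.0)  # clamp after training ends
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

In a multi-partition setup like the one described, each partition would call this with its own `peak_lr` and `total_steps`.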

Good For

  • Conversational AI: Its strong instruction-following makes it well-suited for chatbots and interactive assistants.
  • General Text Generation: Capable of generating diverse and coherent text based on user prompts.
  • Research and Development: Provides a powerful base for further fine-tuning and experimentation in large language models, particularly for those interested in Orca-style training methodologies.