alpindale/WizardLM-2-8x22B

Parameters: 141B
Precision: FP8
Context length: 32768
License: apache-2.0
Overview

WizardLM-2 8x22B: Advanced Multilingual MoE Model

WizardLM-2 8x22B is the most advanced model in the WizardLM-2 family, developed by the WizardLM team at Microsoft AI (WizardLM@Microsoft AI). This 141-billion-parameter Mixture-of-Experts (MoE) model is built on the mistral-community/Mixtral-8x22B-v0.1 base and is designed for superior performance in complex chat, multilingual communication, reasoning, and agent-based applications.
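
For orientation, here is a minimal loading-and-generation sketch using the standard Hugging Face transformers API. It is not taken from the upstream card: the bfloat16 dtype, device_map setting, and generation parameters are illustrative assumptions, the full 141B model needs a multi-GPU node, and FP8 serving is normally handled by a dedicated inference engine rather than this snippet.

```python
# Minimal inference sketch (assumptions: multi-GPU node with enough memory
# for the 141B MoE weights; bfloat16 here, not the FP8 serving precision).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "alpindale/WizardLM-2-8x22B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # illustrative; adjust to your hardware
    device_map="auto",           # shard layers across available GPUs
)

# Vicuna-style prompt; see Usage Notes below for the template.
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: What is a Mixture of Experts model? ASSISTANT:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```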

Key Capabilities & Performance

  • Competitive Performance: Demonstrates highly competitive performance against leading proprietary models and consistently outperforms existing state-of-the-art open-source models.
  • Multilingual Support: Engineered for robust performance across multiple languages.
  • Human Preferences: Achieves strong results in human preference evaluations, trailing GPT-4-1106-preview only slightly and performing significantly better than Command R Plus and GPT-4-0314 across tasks such as writing, coding, math, reasoning, and agent interactions.
  • MT-Bench Evaluation: Shows highly competitive scores on the automatic MT-Bench evaluation framework.

Training Methodology

The model was trained using a fully AI-powered synthetic training system, a novel approach detailed in the WizardLM-2 release blog post.

Usage Notes

WizardLM-2 adopts the Vicuna prompt format for multi-turn conversations. Users should follow the specified prompt structure for optimal interaction.
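
The template itself is not reproduced in this overview. As a sketch, assuming the model follows the standard Vicuna v1.1 convention (a system preamble followed by alternating `USER:` / `ASSISTANT:` turns, with completed assistant replies terminated by `</s>`), a small prompt-building helper could look like the following; verify the exact system text and spacing against the upstream WizardLM-2 card or the tokenizer's chat template.

```python
def build_vicuna_prompt(turns, system=(
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)):
    """Assemble a Vicuna-style multi-turn prompt (assumed format, not verified here).

    `turns` is a list of (user_message, assistant_reply) pairs; pass None as the
    reply of the last pair to ask the model for a new completion.
    """
    prompt = system
    for user_msg, assistant_msg in turns:
        prompt += f" USER: {user_msg} ASSISTANT:"
        if assistant_msg is not None:
            prompt += f" {assistant_msg}</s>"
    return prompt


# Example: one completed turn, then a fresh question for the model to answer.
prompt = build_vicuna_prompt([
    ("Hi", "Hello."),
    ("Who are you?", None),
])
print(prompt)
```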