WizardLM-2 8x22B: Advanced Multilingual MoE Model

WizardLM-2 8x22B is the most advanced model in the WizardLM-2 family, developed by WizardLM@Microsoft AI. This 141-billion-parameter Mixture of Experts (MoE) model is built on the mistral-community/Mixtral-8x22B-v0.1 base and targets complex chat, multilingual communication, reasoning, and agent-based applications.

Key Capabilities & Performance

Competitive Performance: Reported to be highly competitive with leading proprietary models and to consistently outperform existing state-of-the-art open-source models.
Multilingual Support: Engineered for robust performance across multiple languages.
Human Preferences: Achieves strong results in human preference evaluations, falling just slightly behind GPT-4-1106-preview and performing significantly better than Command R Plus and GPT-4-0314 across tasks such as writing, coding, math, reasoning, and agent interactions.
MT-Bench Evaluation: Shows highly competitive scores on the automatic MT-Bench evaluation framework.

Training Methodology

The model was trained using a fully AI-powered synthetic training system, a novel approach detailed in the WizardLM-2 release blog post.

Usage Notes

WizardLM-2 adopts the Vicuna prompt format and supports multi-turn conversations. For best results, format every prompt according to that structure.
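
As an illustration of the Vicuna-style layout, the following is a minimal Python sketch that assembles a multi-turn prompt. The system sentence, the USER:/ASSISTANT: role labels, and the </s> end-of-turn marker follow the commonly used Vicuna v1.1 convention and are assumptions here; the helper name build_vicuna_prompt is hypothetical. Check the exact spacing and special tokens against the official model card before relying on it.

```python
# Assumed Vicuna v1.1-style system sentence; verify against the model card.
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_vicuna_prompt(turns, system=SYSTEM_PROMPT, eos="</s>"):
    """Assemble a Vicuna-style prompt string (hypothetical helper).

    turns: list of (user_message, assistant_reply) pairs. Use None as the
    reply of the final pair to leave the prompt open for the model to answer.
    """
    prompt = system + " "
    for user_msg, assistant_msg in turns:
        prompt += f"USER: {user_msg} ASSISTANT:"
        if assistant_msg is not None:
            # A completed assistant turn is closed with the end-of-sequence marker.
            prompt += f" {assistant_msg}{eos}"
    return prompt


if __name__ == "__main__":
    history = [("Hi", "Hello."), ("Who are you?", None)]
    print(build_vicuna_prompt(history))
```

Passing None as the last reply leaves the string ending in "ASSISTANT:", which is where generation would continue. If the tokenizer shipped with the model provides its own chat template, using that template is likely the safer choice than hand-built strings.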