dreamgen/WizardLM-2-7B

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 8k · Published: Apr 16, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

WizardLM-2 7B, developed by WizardLM@Microsoft AI, is a 7 billion parameter multilingual large language model built upon the Mistral-7B-v0.1 base. It is optimized for complex chat, reasoning, and agent tasks, achieving performance comparable to open-source models ten times its size. The model is designed for rapid inference while maintaining strong capabilities across a range of domains.


WizardLM-2 7B Overview

WizardLM-2 7B is a 7 billion parameter multilingual large language model developed by WizardLM@Microsoft AI, based on the Mistral-7B-v0.1 architecture. It is part of the next-generation WizardLM-2 family, which focuses on improved performance in complex chat, multilingual understanding, reasoning, and agent capabilities. This 7B variant is highlighted for its speed and ability to achieve performance comparable to open-source models that are ten times larger.

Key Capabilities & Performance

  • Multilingual Support: Designed to handle multiple languages effectively.
  • Complex Chat & Reasoning: Optimized for intricate conversational flows and advanced reasoning tasks.
  • Agent Tasks: Demonstrates proficiency in agent-based applications.
  • Competitive Benchmarks: On MT-Bench, WizardLM-2 7B is a top-performing model among 7B to 70B scale baselines. In human preference evaluations, it is comparable to Qwen1.5-32B-Chat and surpasses Qwen1.5-14B-Chat and Starling-LM-7B-beta on a challenging real-world instruction set covering writing, coding, math, reasoning, and agent tasks.
  • Synthetic Training: The model was trained using a fully AI-powered synthetic training system, as detailed in the WizardLM-2 release blog post.

Usage Notes

WizardLM-2 7B adopts the Vicuna prompt format for multi-turn conversations. For optimal results, structure prompts as:

USER: Hi ASSISTANT: Hello.</s>USER: Who are you? ASSISTANT: I am WizardLM.</s>
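The Vicuna-style format can be assembled programmatically. A minimal sketch (the helper name is ours, not part of any official SDK); the final turn may leave the assistant side empty so the model continues from `ASSISTANT:`:

```python
def build_vicuna_prompt(turns):
    """Format multi-turn chat in the Vicuna style WizardLM-2 7B expects.

    `turns` is a list of (user, assistant) pairs; pass None as the
    assistant of the last pair to leave the prompt open for generation.
    """
    prompt = ""
    for user, assistant in turns:
        prompt += f"USER: {user} ASSISTANT:"
        if assistant is not None:
            # Completed assistant turns are closed with the </s> EOS token.
            prompt += f" {assistant}</s>"
    return prompt

# Example: replay one finished exchange, then ask a new question.
prompt = build_vicuna_prompt([("Hi", "Hello."), ("Who are you?", None)])
# → "USER: Hi ASSISTANT: Hello.</s>USER: Who are you? ASSISTANT:"
```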

Popular Sampler Settings

The top three parameter combinations used by Featherless users for this model adjust the following samplers: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
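As a sketch of how these samplers map onto a request, the following shows an OpenAI-style completion payload. The sampler values here are placeholders chosen for illustration, not the actual top community configs, and the endpoint path is assumed:

```python
# Hypothetical sampler values for illustration only; substitute the
# values from your preferred community config.
payload = {
    "model": "dreamgen/WizardLM-2-7B",
    "prompt": "USER: Who are you? ASSISTANT:",
    "max_tokens": 256,
    "temperature": 0.7,         # randomness of sampling
    "top_p": 0.9,               # nucleus sampling cutoff
    "top_k": 40,                # restrict to the k most likely tokens
    "frequency_penalty": 0.0,   # penalize tokens by how often they appeared
    "presence_penalty": 0.0,    # penalize tokens that appeared at all
    "repetition_penalty": 1.1,  # multiplicative repetition discouragement
    "min_p": 0.05,              # drop tokens below this fraction of the top prob
    "stop": ["</s>", "USER:"],  # end generation at turn boundaries
}
```

A POST of this JSON body to an OpenAI-compatible `/v1/completions` endpoint (with your API key in the `Authorization` header) would then return the model's next assistant turn.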