TheBloke/wizard-vicuna-13B-HF

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4K · Published: May 4, 2023 · Architecture: Transformer

TheBloke/wizard-vicuna-13B-HF is a 13 billion parameter language model based on junelee's WizardVicunaLM, converted to float16 format for easier GPU inference. It combines WizardLM's in-depth dataset handling with Vicuna's multi-round conversation tuning method, and is designed to improve conversational capability, showing an approximately 7% improvement over VicunaLM in preliminary GPT-4-scored evaluations.


Wizard-Vicuna-13B-HF Overview

This model, created by junelee and converted to float16 by TheBloke for optimized GPU inference, is a 13 billion parameter language model. It merges the strengths of two prominent LLMs: WizardLM's approach to deep and broad dataset handling, and VicunaLM's method for overcoming single-turn conversation limitations through multi-round interactions.

Key Capabilities & Features

  • Enhanced Conversational Ability: Designed to extend single commands into rich, continuous conversations, improving dialogue flow.
  • Hybrid Training Approach: Utilizes WizardLM's dataset expansion combined with Vicuna's fine-tuning techniques for conversational formats.
  • Performance Improvement: Preliminary evaluations, scored by GPT-4, indicate an approximate 7% performance improvement over VicunaLM-13B.
  • Optimized for Inference: Provided in float16 format, reducing VRAM and disk space requirements compared to the original float32 version.
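The float16 conversion roughly halves the weight-storage footprint relative to the original float32 checkpoint. A back-of-the-envelope sketch (weights only; activations and the KV cache add more on top):

```python
# Back-of-the-envelope weight memory for a 13B-parameter model.
# Assumption: weights only -- activations and KV cache are extra.
NUM_PARAMS = 13_000_000_000

def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Weights-only memory in GiB at a given precision."""
    return num_params * bytes_per_param / (1024 ** 3)

fp32_gib = weight_memory_gib(NUM_PARAMS, 4)  # original float32
fp16_gib = weight_memory_gib(NUM_PARAMS, 2)  # this float16 conversion
print(f"float32: ~{fp32_gib:.0f} GiB, float16: ~{fp16_gib:.0f} GiB")
# → float32: ~48 GiB, float16: ~24 GiB
```

This is why the float16 conversion fits on a single high-memory GPU, whereas the float32 original typically would not.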

When to Use This Model

  • Conversational AI: Ideal for applications requiring more dynamic and extended multi-turn dialogues.
  • Research & Experimentation: Suitable for exploring hybrid fine-tuning methodologies combining different LLM strengths.
  • Resource-Efficient Deployment: The float16 conversion makes it more accessible for GPU inference with reduced memory footprint.
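As a minimal sketch of how such a float16 checkpoint is typically loaded and prompted with Hugging Face `transformers` (assumes `transformers`, `torch`, and `accelerate` are installed; the USER/ASSISTANT template and the `build_prompt` helper below are assumptions modeled on common Vicuna-style formats, not a confirmed training template):

```python
from __future__ import annotations

# Sketch: load the fp16 checkpoint and build a multi-turn prompt.
# transformers/torch are imported lazily so the helpers can be defined
# without the heavy dependencies installed.
MODEL_ID = "TheBloke/wizard-vicuna-13B-HF"

SYSTEM = (
    "A chat between a curious user and an artificial intelligence "
    "assistant. The assistant gives helpful, detailed answers."
)

def build_prompt(turns: list[tuple[str, str | None]]) -> str:
    """Assumed Vicuna-style template: alternating USER:/ASSISTANT: turns.

    `turns` is a list of (user_message, assistant_reply) pairs; pass
    None as the last reply to leave the prompt open for generation.
    """
    parts = [SYSTEM]
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg}")
        parts.append(f"ASSISTANT: {assistant_msg}"
                     if assistant_msg is not None else "ASSISTANT:")
    return "\n".join(parts)

def load_fp16_model(model_id: str = MODEL_ID):
    """Load tokenizer and model in float16 across available GPUs."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # weights are already stored in fp16
        device_map="auto",          # requires `accelerate`
    )
    return tokenizer, model
```

With the model loaded, a continuation for the open `ASSISTANT:` turn can be produced via `model.generate` on the tokenized prompt; the multi-turn template is what lets a single command unfold into an extended dialogue.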