TheBloke/wizard-vicuna-13B-HF
TheBloke/wizard-vicuna-13B-HF is a 13-billion-parameter language model based on junelee's WizardVicunaLM, converted to float16 format for more memory-efficient GPU inference. It combines WizardLM's in-depth and in-breadth instruction-dataset construction with Vicuna's multi-round conversation tuning method. The model is designed to improve conversational capability, showing approximately 7% improvement over VicunaLM in preliminary GPT-4-scored evaluations.
Wizard-Vicuna-13B-HF Overview
This model, created by junelee and converted to float16 by TheBloke for optimized GPU inference, is a 13-billion-parameter language model. It merges the strengths of two prominent LLMs: WizardLM's approach to evolving instruction data in both depth and breadth, and VicunaLM's multi-round interaction tuning, which addresses the limitations of single-turn conversation.
Key Capabilities & Features
- Enhanced Conversational Ability: Designed to extend single commands into rich, continuous conversations, improving dialogue flow.
- Hybrid Training Approach: Utilizes WizardLM's dataset expansion combined with Vicuna's fine-tuning techniques for conversational formats.
- Performance Improvement: Preliminary evaluations, scored by GPT-4, indicate an approximate 7% performance improvement over VicunaLM-13B.
- Optimized for Inference: Provided in float16 format, reducing VRAM and disk space requirements compared to the original float32 version.
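The memory saving from the float16 conversion is easy to quantify: at 2 bytes per parameter, the 13B weights occupy roughly 26 GB instead of the ~52 GB a float32 copy would need. A minimal sketch of both the arithmetic and a typical Hugging Face `transformers` loading call follows; the loader is defined but not invoked here, since it assumes the transformers/torch stack, the multi-gigabyte checkpoint download, and a GPU with enough memory.

```python
# Sketch: estimating float16 memory footprint and loading the model
# with Hugging Face transformers. The load_model() call is heavy
# (checkpoint download + GPU memory), so it is defined but not run.

def fp16_footprint_gb(n_params: float) -> float:
    """Approximate weight memory in GB: 2 bytes per float16 parameter."""
    return n_params * 2 / 1e9

def load_model(model_id: str = "TheBloke/wizard-vicuna-13B-HF"):
    # Assumes torch and transformers are installed and a GPU is available.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half the memory of float32
        device_map="auto",          # spread layers across available GPUs
    )
    return tokenizer, model

print(f"~{fp16_footprint_gb(13e9):.0f} GB of weights in float16")  # vs ~52 GB in float32
```

`device_map="auto"` lets `accelerate` place layers across whatever GPUs (and CPU offload) are available, which is useful because 26 GB of weights exceeds a single consumer GPU.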
When to Use This Model
- Conversational AI: Ideal for applications requiring more dynamic and extended multi-turn dialogues.
- Research & Experimentation: Suitable for exploring hybrid fine-tuning methodologies combining different LLM strengths.
- Resource-Efficient Deployment: The float16 conversion makes it more accessible for GPU inference with reduced memory footprint.
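For the multi-turn dialogue use case above, Vicuna-derived models are commonly prompted with alternating `USER:`/`ASSISTANT:` turns. The exact template this checkpoint was tuned on is not stated here, so the format below is an assumption; it is a minimal sketch of flattening a conversation history into a single prompt string for generation.

```python
# Hypothetical sketch of a Vicuna-style multi-turn prompt builder.
# The USER:/ASSISTANT: template is an assumed convention, not a
# format confirmed by this model card.

def build_prompt(turns, system="A chat between a curious user and an AI assistant."):
    """Flatten (user, assistant) turn pairs into one prompt string.

    Pass None as the assistant reply for the final, unanswered turn.
    """
    parts = [system]
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg}")
        if assistant_msg is not None:
            parts.append(f"ASSISTANT: {assistant_msg}")
    parts.append("ASSISTANT:")  # left open for the model to complete
    return "\n".join(parts)

history = [
    ("What is float16?", "A 16-bit floating-point format."),
    ("Why use it for inference?", None),  # awaiting the model's answer
]
print(build_prompt(history))
```

The resulting string would be tokenized and passed to the model's `generate` call; appending the model's reply back into `history` extends the single exchange into the continuous conversation the model is tuned for.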