TheBloke/wizard-vicuna-13B-SuperHOT-8K-fp16
TEXT GENERATION · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · License: other · Architecture: Transformer

TheBloke/wizard-vicuna-13B-SuperHOT-8K-fp16 is a 13 billion parameter language model, a float16 PyTorch version of June Lee's Wizard Vicuna 13B merged with Kaio Ken's SuperHOT 8K LoRA. This model is specifically designed to leverage an extended context window of 8192 tokens, significantly enhancing its ability to process and generate longer sequences of text. It combines the conversational strengths of Wizard Vicuna with SuperHOT's context extension, making it suitable for applications requiring deep contextual understanding over extended dialogues or documents.


Model Overview

This repository provides the float16 PyTorch weights of the merged model: June Lee's Wizard Vicuna 13B combined with Kaio Ken's SuperHOT 8K LoRA. Its primary differentiator is the extended context window, supporting up to 8192 tokens during inference through the SuperHOT 8K integration.
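As a sketch of how this checkpoint might be consumed, the snippet below loads the fp16 weights with Hugging Face transformers and builds a Vicuna-style `USER:`/`ASSISTANT:` prompt. Both the prompt template and the `trust_remote_code=True` flag are assumptions based on how SuperHOT merges are typically packaged (the repo ships patched RoPE-scaling code); they are not guarantees from this model card.

```python
# Assumed consumer stack: Hugging Face transformers + PyTorch.
MODEL_ID = "TheBloke/wizard-vicuna-13B-SuperHOT-8K-fp16"

def build_prompt(turns):
    """Format (user, assistant) turns in the Vicuna 1.1 style this model
    family commonly expects; the final assistant slot is left open.
    The exact template is an assumption, not confirmed by the card."""
    parts = []
    for user, assistant in turns:
        parts.append(f"USER: {user}")
        if assistant is not None:
            parts.append(f"ASSISTANT: {assistant}")
    parts.append("ASSISTANT:")
    return "\n".join(parts)

def load_model():
    """Load the fp16 checkpoint; needs roughly 26 GB of GPU memory."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,
        device_map="auto",
        trust_remote_code=True,  # SuperHOT repos bundle patched RoPE code
    )
    return tokenizer, model
```

Because calling `load_model()` downloads the full 13B checkpoint, it is defined but not invoked above; `build_prompt` can be used independently to prepare multi-turn inputs.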

Key Capabilities

  • Extended Context Handling: Leverages an 8K context window, enabling the model to maintain coherence and understanding over much longer inputs and outputs compared to standard models.
  • Enhanced Conversational Ability: Built upon Wizard Vicuna, which combines WizardLM's in-depth instruction dataset with Vicuna's multi-round conversation tuning methods, leading to improved dialogue capabilities.
  • Performance Improvement: The original Wizard Vicuna 13B showed approximately a 7% improvement over Vicuna-13B in an informal, GPT-4-scored benchmark.

Good For

  • Applications requiring processing or generating long documents, articles, or extended dialogues.
  • Use cases where maintaining context over many turns in a conversation is crucial.
  • Developers looking for a 13B model with a large context window for GPU inference, with options for further quantization (GPTQ, GGML) available from TheBloke.
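For long-document use cases, it helps to budget the 8192-token window between the prompt and the generated output. A minimal sketch, using a rough ~4 characters/token heuristic (an assumption for illustration; precise counts require the model's actual tokenizer):

```python
CTX_LIMIT = 8192      # SuperHOT 8K context window
CHARS_PER_TOKEN = 4   # rough heuristic, not the model's real tokenizer

def fit_to_context(document: str, max_new_tokens: int) -> str:
    """Truncate a document so estimated prompt tokens plus the generation
    budget fit within the 8K context window."""
    prompt_budget = CTX_LIMIT - max_new_tokens
    if prompt_budget <= 0:
        raise ValueError("max_new_tokens exhausts the context window")
    max_chars = prompt_budget * CHARS_PER_TOKEN
    return document[:max_chars]
```

For example, with `max_new_tokens=512` the prompt budget is 7680 estimated tokens, so a very long document is cut to the first 30720 characters; in production the same budgeting should be done with real token counts from the tokenizer.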