gsaivinay/wizard-vicuna-13B-SuperHOT-8K-fp16

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · License: other · Architecture: Transformer

gsaivinay/wizard-vicuna-13B-SuperHOT-8K-fp16 is a 13 billion parameter language model, a merge of June Lee's Wizard Vicuna 13B and Kaio Ken's SuperHOT 8K LoRA. This model is optimized for extended context, supporting an 8K context window during inference. It is designed for general conversational AI tasks requiring longer memory and improved conversational flow.


Overview

This model, gsaivinay/wizard-vicuna-13B-SuperHOT-8K-fp16, is a 13 billion parameter language model created by merging June Lee's Wizard Vicuna 13B with Kaio Ken's SuperHOT 8K LoRA. It is provided in fp16 PyTorch format, making it suitable for GPU inference and further conversions. A key feature is its extended context window of 8K tokens, achieved through the SuperHOT merge and specific configuration settings.
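SuperHOT-style merges typically reach the longer window through linear RoPE position interpolation: token positions are compressed by a fixed factor so that every rotary angle stays inside the range the base model saw during training. The sketch below illustrates that idea only; the function name, dimensions, and formula are illustrative and are not taken from this model's actual code or configuration.

```python
import math  # not strictly needed; the angles use plain float arithmetic

def rope_angles(position, dim=128, base=10000.0, scale=1.0):
    """Rotary-embedding angles for a single token position.

    `scale` < 1 compresses positions (linear interpolation), the
    general technique SuperHOT-style merges use to stretch a model
    trained on a short context to a longer inference window.
    Illustrative sketch, not this model's implementation.
    """
    pos = position * scale
    return [pos / (base ** (2 * i / dim)) for i in range(dim // 2)]

# Example: a model trained on 2048-token contexts, stretched to 8192
# tokens, would use scale = 2048 / 8192 = 0.25.
scale = 2048 / 8192

# Position 8191 under interpolation produces the same angles as
# position 2047.75 did natively, so no angle exceeds the trained range.
assert rope_angles(8191, scale=scale) == rope_angles(8191 * scale)
```

In practice, inference stacks expose this compression as a configuration setting rather than code (for example, a RoPE-scaling or position-embedding-compression option), which is what the "specific configuration settings" above refer to.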

Key Capabilities

  • Extended Context: Leverages an 8K context window, significantly improving the model's ability to handle longer conversations and more complex prompts compared to standard 4K context models.
  • Enhanced Conversational Ability: Built upon Wizard Vicuna, which combines WizardLM's deep dataset expansion with Vicuna's multi-round conversation tuning, leading to more coherent and engaging dialogues.
  • Performance Improvement: The original Wizard Vicuna 13B showed approximately a 7% performance improvement over VicunaLM in GPT-4 scored evaluations.

Good For

  • Applications requiring long-form text generation or extended conversational memory.
  • Developers looking for a 13B model with improved context handling for complex tasks.
  • Use cases where the base Wizard Vicuna's conversational strengths are beneficial, now with an expanded context.