TheBloke/Vicuna-7B-v1-3-SuperHOT-8K-fp16

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4K · License: other · Architecture: Transformer

TheBloke/Vicuna-7B-v1-3-SuperHOT-8K-fp16 is a 7-billion-parameter language model, assembled by TheBloke, that merges LMSYS's Vicuna 7B v1.3 with Kaio Ken's SuperHOT 8K LoRA. The merge extends the context length to 8192 tokens, making the model suitable for applications that need longer conversational memory or that process extensive documents. Its primary use case is research and development of chatbots and large language models, particularly in scenarios that benefit from a larger context window.


Vicuna 7B v1.3 SuperHOT 8K fp16

This model, created by TheBloke, is a 7-billion-parameter variant that combines LMSYS's Vicuna 7B v1.3 with Kaio Ken's SuperHOT 8K LoRA. Its main innovation is a significantly extended context window of up to 8192 tokens, achieved by merging the SuperHOT 8K LoRA onto the base Vicuna model and loading the result with trust_remote_code=True, which enables the repository's custom modelling code and activates the 8K context (see the loading sketch below).
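A minimal sketch of that loading path, assuming the transformers, accelerate, and torch packages are installed and a GPU is available; only the repository name comes from this card, while the prompt and generation settings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TheBloke/Vicuna-7B-v1-3-SuperHOT-8K-fp16"

tokenizer = AutoTokenizer.from_pretrained(model_name)

# trust_remote_code=True lets the repo's custom modelling code run,
# which is what activates the extended 8192-token context.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,     # matches the fp16 checkpoint
    device_map="auto",             # requires the accelerate package
    trust_remote_code=True,
)

# Vicuna v1.3 uses a simple USER/ASSISTANT prompt format.
prompt = "USER: Summarize the following document.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```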

Key Capabilities

  • Extended Context Length: Supports an 8192-token context window, ideal for long-form conversations or document analysis.
  • Vicuna Base: Benefits from the conversational fine-tuning of the original Vicuna v1.3 model, which was trained on user-shared conversations from ShareGPT.
  • SuperHOT Integration: Incorporates the SuperHOT LoRA, which was trained with a focus on NSFW content and on extended-context techniques that compress rotary position indices (see the sketch after this list).
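The position-interpolation idea behind SuperHOT-style context extension can be sketched as follows. This is an illustrative reconstruction, not the repository's actual code; the function name and the scale factor of 0.25 (mapping 8192 positions onto a 2048-position base range) are assumptions:

```python
import torch

def scaled_rope_angles(dim: int, seq_len: int, scale: float = 0.25,
                       base: float = 10000.0) -> torch.Tensor:
    """Rotary-embedding angles with interpolated (compressed) positions."""
    # Standard RoPE inverse frequencies for a head dimension of `dim`.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # Compress positions so token t behaves like position t * scale;
    # with scale=0.25, 8192 tokens fit in the original 2048-position range.
    positions = torch.arange(seq_len).float() * scale
    return torch.outer(positions, inv_freq)  # shape: (seq_len, dim // 2)

angles = scaled_rope_angles(dim=128, seq_len=8192)
cos, sin = angles.cos(), angles.sin()  # fed into the attention rotation
```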

Good for

  • Research and Development: Primarily intended for researchers and hobbyists exploring large language models and chatbots.
  • Applications Requiring Long Context: Suitable for use cases where maintaining extensive conversational history or processing large text inputs is crucial.
  • Further Conversions: The fp16 PyTorch format makes it a good starting point for further quantization or conversion to other formats (e.g., GPTQ, GGML); a quantization sketch follows below.
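As an example of such a conversion, here is a hedged sketch of 4-bit GPTQ quantization using the auto-gptq package; the quantization settings, calibration text, and output directory are placeholder assumptions rather than values from this card, and a real conversion would use a few hundred calibration samples:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_name = "TheBloke/Vicuna-7B-v1-3-SuperHOT-8K-fp16"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Common GPTQ settings; chosen here for illustration only.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(
    model_name, quantize_config, trust_remote_code=True
)

# One short calibration sample keeps this sketch self-contained;
# production conversions need a representative calibration set.
examples = [tokenizer("Vicuna is a chat assistant fine-tuned from LLaMA.",
                      return_tensors="pt")]
model.quantize(examples)
model.save_quantized("vicuna-7b-superhot-8k-gptq")
```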