TheBloke/Wizard-Vicuna-7B-Uncensored-SuperHOT-8K-fp16
TheBloke/Wizard-Vicuna-7B-Uncensored-SuperHOT-8K-fp16 is a 7 billion parameter causal language model published by TheBloke, merging Eric Hartford's Wizard Vicuna 7B Uncensored with Kaio Ken's SuperHOT 8K. This fp16 PyTorch model is intended for GPU inference and supports an extended context length of 8192 tokens. It inherits the uncensored behaviour of its Wizard Vicuna base and is suited to tasks that need longer context.
Model Overview
This model, TheBloke/Wizard-Vicuna-7B-Uncensored-SuperHOT-8K-fp16, is a 7 billion parameter language model in fp16 PyTorch format. It is a merge of Eric Hartford's Wizard Vicuna 7B Uncensored and Kaio Ken's SuperHOT 8K.
Key Capabilities
- Extended Context Window: Achieves an 8192-token context length during inference by using `trust_remote_code=True` together with the repository's modified `config.json` (see the loading sketch after this list).
- Uncensored Responses: Based on Eric Hartford's Wizard Vicuna 7B Uncensored, which was trained with alignment/moralizing responses removed, allowing for unconstrained output.
- GPU Inference: Provided in fp16 PyTorch format, optimized for direct use on GPUs.
- SuperHOT Integration: Incorporates Kaio Ken's SuperHOT 7B LoRA, which was trained with a focus on NSFW content and extended context capabilities.
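
Below is a minimal loading sketch, assuming a standard transformers + PyTorch setup on a CUDA GPU (with accelerate installed for `device_map="auto"`). The repository name, fp16 weights, and the `trust_remote_code=True` requirement for the 8192-token context come from this model card; the Vicuna-style prompt and generation parameters are illustrative.

```python
# Minimal sketch: load the fp16 model for GPU inference with the extended
# 8K context enabled via trust_remote_code (assumes transformers, torch,
# accelerate and a CUDA GPU are available).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Wizard-Vicuna-7B-Uncensored-SuperHOT-8K-fp16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 weights, as shipped in this repo
    device_map="auto",          # place layers on the available GPU(s)
    trust_remote_code=True,     # loads the custom RoPE-scaling code that, together
                                # with the modified config.json, extends the
                                # context window to 8192 tokens
)

# Illustrative Vicuna-style prompt; adjust to your use case.
prompt = "USER: Summarise the following document.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Omitting `trust_remote_code=True` falls back to the stock Llama loading path, so the extended context provided by the SuperHOT modifications would not be applied.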
Good For
- Applications requiring long-context understanding and generation.
- Use cases where uncensored or unfiltered responses are desired.
- Developers looking for a base model for further fine-tuning or experimentation with custom alignment.
- Scenarios benefiting from the SuperHOT LoRA's specific training focus.
For lower-VRAM GPU inference, 4-bit GPTQ versions are available; for CPU inference, 2, 3, 4, 5, 6 and 8-bit GGML versions are available.