TheBloke/Selfee-7B-SuperHOT-8K-fp16
TheBloke/Selfee-7B-SuperHOT-8K-fp16 is a 7 billion parameter language model, created by merging Kaist AI's Selfee 7B with Kaio Ken's SuperHOT 8K LoRA. This model is designed for extended context applications, supporting an 8K context length during inference. It is provided in fp16 PyTorch format, suitable for GPU inference and further conversions, and is particularly noted for its NSFW-focused fine-tuning.
Model Overview
This model, TheBloke/Selfee-7B-SuperHOT-8K-fp16, is a 7 billion parameter language model resulting from the merge of Kaist AI's Selfee 7B base model with Kaio Ken's SuperHOT 8K LoRA. It is distributed in fp16 PyTorch format, optimized for GPU inference.
Key Capabilities
- Extended Context Window: Achieves an 8K (8,192-token) context length during inference, enabled by the SuperHOT 8K merge; loading requires `trust_remote_code=True`.
- NSFW Focus: The SuperHOT LoRA component was specifically trained with an NSFW focus, making this model potentially suitable for applications requiring such content generation.
- Flexible Configuration: The `config.json` sets the sequence length to 8192 by default; it can be lowered to 4096 if a smaller sequence length is desired.
- Conversion Ready: The fp16 PyTorch format serves as a base for further conversions into other formats such as GPTQ (4-bit) or GGML (2-8 bit) for various hardware setups.
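SuperHOT-style merges typically reach the longer window via linear RoPE position interpolation: token positions are compressed by the ratio of the original to the extended context so the model never sees rotary angles outside its pretraining range. The sketch below illustrates that idea in isolation; the constants and function names are illustrative, not taken from this model's code.

```python
import math

# Illustrative sketch of linear RoPE position interpolation, the
# technique SuperHOT-style merges use to stretch a 2048-token
# pretraining window to 8192 tokens. Names here are assumptions,
# not the model's actual implementation.

ORIG_CTX = 2048      # base LLaMA pretraining context
EXTENDED_CTX = 8192  # target context after the SuperHOT merge
SCALE = ORIG_CTX / EXTENDED_CTX  # 0.25: positions are compressed 4x

def rope_angles(position: float, dim: int = 128,
                base: float = 10000.0, scale: float = 1.0):
    """Rotary-embedding angles for one position, optionally interpolated."""
    pos = position * scale  # linear interpolation squeezes positions
    return [pos / base ** (2 * i / dim) for i in range(dim // 2)]

# Position 8191 with scaling lands on the same angles the model saw
# for position 8191 * 0.25 = 2047.75 during pretraining, so the
# extended positions stay inside the pretrained distribution.
scaled = rope_angles(8191, scale=SCALE)
reference = rope_angles(8191 * SCALE)
assert all(math.isclose(a, b) for a, b in zip(scaled, reference))
```

This is also why `trust_remote_code=True` comes up: the scaled rotary embedding has to be patched in at load time rather than being part of the stock architecture.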
Good for
- Applications requiring a 7B parameter model with an extended 8K context window.
- Use cases that benefit from a model with NSFW-focused fine-tuning.
- Developers looking for a PyTorch fp16 model for GPU inference or as a starting point for custom quantizations and conversions.
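To gauge whether the fp16 release fits your hardware or whether a quantized conversion is needed, a back-of-envelope weight-memory estimate helps. The figures below cover weights only; the KV cache and activations, which grow with the 8K context, add more on top.

```python
# Rough weight-memory estimate for a 7B-parameter model in the formats
# mentioned above (fp16 vs. 4-bit GPTQ). Weights only: KV cache and
# activations for an 8K context are not included.

PARAMS = 7_000_000_000

def weights_gb(bits_per_param: float) -> float:
    """Approximate weight storage in GiB at a given precision."""
    return PARAMS * bits_per_param / 8 / 1024**3

fp16_gb = weights_gb(16)   # ~13 GiB: wants a 16 GB+ GPU
gptq4_gb = weights_gb(4)   # ~3.3 GiB: fits common consumer GPUs
print(f"fp16: {fp16_gb:.1f} GiB, 4-bit GPTQ: {gptq4_gb:.1f} GiB")
```

The 4x gap is why the fp16 release is positioned as a conversion base: most deployments quantize it down rather than serving fp16 directly.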