TheBloke/Robin-7B-v2-SuperHOT-8K-fp16
TheBloke/Robin-7B-v2-SuperHOT-8K-fp16 is a 7 billion parameter causal language model, created by OptimalScale and further developed by Kaio Ken and TheBloke. It merges OptimalScale's Robin 7B v2 with Kaio Ken's SuperHOT 8K LoRA, extending its context window to 8192 tokens. This model is specifically fine-tuned for NSFW content and is designed for GPU inference, offering an expanded context length for more extensive conversational or generative tasks.
Model Overview
This model, TheBloke/Robin-7B-v2-SuperHOT-8K-fp16, is a 7 billion parameter language model based on OptimalScale's Robin 7B v2. It has been enhanced by merging with Kaio Ken's SuperHOT 8K LoRA, which extends its context window to 8192 tokens. This allows it to process and generate much longer sequences of text than the base model's original 2048 token context.
Key Capabilities
- Extended Context Window: Achieves an 8K context length, enabling more coherent and detailed long-form generation and understanding.
- NSFW Focus: The SuperHOT LoRA was specifically trained with a focus on NSFW content.
- FP16 Precision: Provided in fp16 PyTorch format, suitable for GPU inference and further conversions.
- Flexible Configuration: The `config.json` sets the sequence length to 8192 by default, but this can be adjusted.
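The 8K window comes from SuperHOT-style linear position interpolation: token positions are scaled down by the compression factor so that a longer sequence still fits inside the base model's original position range. A minimal sketch of that scaling (the function name and the 2048-token base window are illustrative assumptions, not part of the released code):

```python
def interpolate_positions(seq_len, compress_factor):
    """Scale token positions down by compress_factor so that an
    extended sequence stays within the base model's position range.
    Mirrors the idea behind --compress_pos_emb."""
    return [pos / compress_factor for pos in range(seq_len)]


# With a factor of 4, all 8192 positions map back into the
# assumed 2048-position window of the original base model.
positions = interpolate_positions(8192, 4)
assert max(positions) < 2048
```

This is why the `--compress_pos_emb 4` argument in the usage notes must match the extended context length: 8192 / 4 lands exactly on the base window.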
Usage Notes
- Requires `trust_remote_code=True` for proper context scaling during inference.
- For the `exllama` or `exllama_hf` loaders, use the `--max_seq_len 8192 --compress_pos_emb 4` arguments.
- The model is a merge of `OptimalScale/robin-7b-v2-delta` and `kaiokendev/superhot-7b-8k-no-rlhf-test`.
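Putting the loader notes above together, a text-generation-webui launch for this model might look like the following sketch. The `server.py` entry point and flag names reflect text-generation-webui's CLI around this model's release and should be treated as assumptions; check the flags against your installed version:

```shell
# Launch text-generation-webui with the exllama_hf loader and the
# 8K context settings this model expects. The model directory name
# is an assumption based on TheBloke's usual naming convention.
python server.py \
  --model TheBloke_Robin-7B-v2-SuperHOT-8K-fp16 \
  --loader exllama_hf \
  --max_seq_len 8192 \
  --compress_pos_emb 4
```

If `--compress_pos_emb` is omitted or mismatched, generations typically degrade past the base context length, since the positions are no longer interpolated back into the range the model was trained on.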