Overview
This model, TheBloke/Pygmalion-7B-SuperHOT-8K-fp16, is an unquantized, 7-billion-parameter model in fp16 PyTorch format. It is a merge by TehVenom of the original Pygmalion 7B with Kaio Ken's SuperHOT 8K LoRA. The SuperHOT 8K merge extends the model's context window from the LLaMA default to 8192 tokens, enabling longer and more coherent conversations.
Key Capabilities
- Extended Context: Supports an 8K context length, allowing for longer and more detailed conversational memory.
- Dialogue Optimization: Fine-tuned for conversational AI, specifically for generating character-driven dialogue.
- Roleplay Focus: Inherits Pygmalion's training on persona and chat formats, making it suitable for fictional conversation and roleplay scenarios.
- Unquantized Format: Provided in fp16 PyTorch format, ideal for GPU inference and further model conversions.
Good for
- Fictional Conversation: Excels at generating engaging and in-character dialogue for entertainment purposes.
- Roleplay Applications: Designed to handle persona-based interactions and maintain conversational context over extended turns.
- Developers requiring high precision: The fp16 format is suitable for those who need full precision for inference or further fine-tuning.
Important Notes
- The model is not fine-tuned to be safe and harmless; it may produce socially unacceptable or offensive text due to its training data, which includes profanity and lewd content.
- Outputs may be factually incorrect or misleading.
- To achieve the 8K context length during inference, trust_remote_code=True must be set when loading the model, or a provided monkey patch can be applied for custom Python UIs.
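As a rough illustration of the trust_remote_code route, the sketch below loads the checkpoint in fp16 with the Transformers library. This is a minimal, hypothetical example, not an official recipe: it assumes the `transformers` and `torch` packages are installed and that enough GPU memory is available for a 7B fp16 model; only the repository name comes from this card.

```python
# Hypothetical loading sketch for the fp16 checkpoint (assumes the
# `transformers` library; the model id is taken from this card).

MODEL_ID = "TheBloke/Pygmalion-7B-SuperHOT-8K-fp16"
MAX_CONTEXT = 8192   # extended window provided by the SuperHOT 8K LoRA
BASE_CONTEXT = 2048  # original LLaMA context length
SCALE = MAX_CONTEXT // BASE_CONTEXT  # the 4x context extension factor


def load_model():
    """Load tokenizer and fp16 model with the repo's custom 8K code enabled."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,   # keep the unquantized fp16 weights
        device_map="auto",           # place layers on available GPUs
        trust_remote_code=True,      # required for the extended-context code
    )
    return tokenizer, model


# Usage (downloads ~14 GB of weights, so not run here):
# tokenizer, model = load_model()
```

For UIs that load the model through their own code paths, the card instead points to the provided monkey patch, which adjusts the rotary position embeddings before the model is instantiated.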