TheBloke/Chronos-13B-SuperHOT-8K-fp16 is a 13 billion parameter LLaMA-based model, merged from Elinas' Chronos 13B and Kaio Ken's SuperHOT 8K LoRA. This fp16 PyTorch model is optimized for extended context, supporting up to 8192 tokens for coherent long-form generation. It excels in chat, roleplay, and storywriting, while also handling simple reasoning and coding tasks.
Model Overview
This model, Chronos-13B-SuperHOT-8K-fp16, is a 13 billion parameter LLaMA-based language model created by TheBloke. It is a merge of Elinas' Chronos 13B base model with Kaio Ken's SuperHOT 8K LoRA, specifically designed to enhance context handling. SuperHOT's scaled rotary position embeddings (RoPE) extend the context window to 8192 tokens at inference time, significantly improving the model's ability to process and generate long sequences of text.
Key Capabilities
- Extended Context Window: Supports an 8K (8192 token) context length, enabling more coherent and detailed long-form text generation.
- Merged Architecture: Combines the strengths of Elinas' Chronos 13B, known for chat, roleplay, and storywriting, with the context extension capabilities of SuperHOT 8K.
- Versatile Generation: Capable of generating very long outputs with high coherence, suitable for creative writing, roleplay scenarios, and general chat.
- Instruction Following: Performs best with Alpaca-style prompt formatting, responding to `### Instruction:` and `### Response:` markers.
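The Alpaca layout referenced above can be assembled with a small helper. This is an illustrative sketch (the function name is made up; the `### Instruction:` / `### Response:` markers are the standard Alpaca ones):

```python
def build_alpaca_prompt(instruction: str, response: str = "") -> str:
    """Assemble an Alpaca-style prompt using the standard section markers."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}"
    )

prompt = build_alpaca_prompt("Write a short story about a clockmaker.")
print(prompt)
```

The generated string is what you would pass to the tokenizer; the model's completion continues after the `### Response:` marker.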
Good For
- Creative Writing: Generating detailed stories, narratives, and creative content.
- Roleplay: Engaging in extended roleplay scenarios with a larger memory of the conversation.
- Long-form Chat: Maintaining context over lengthy dialogues and discussions.
- Research and Development: Serving as an unquantized fp16 base for further conversions or GPU inference, particularly for tasks requiring extended context.
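For long-form chat, the running transcript still has to fit inside the 8192-token window. A minimal sketch of one truncation strategy, assuming a generic token-counting callable (`trim_history` and the whitespace counter are hypothetical stand-ins; real counts would come from the model's LLaMA tokenizer):

```python
from typing import Callable, List

def trim_history(turns: List[str], count_tokens: Callable[[str], int],
                 max_tokens: int = 8192, reserve: int = 512) -> List[str]:
    """Drop the oldest turns until the transcript, plus a generation
    reserve, fits inside the context window."""
    budget = max_tokens - reserve
    kept: List[str] = []
    total = 0
    # Walk newest-first so the most recent context survives truncation.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if total + cost > budget:
            break
        kept.append(turn)
        total += cost
    return list(reversed(kept))

# Crude whitespace split stands in for a real tokenizer here.
approx = lambda text: len(text.split())
history = [f"turn {i}: " + "word " * 100 for i in range(200)]
trimmed = trim_history(history, approx)
```

Reserving headroom for the generated tokens (here 512) is the key design choice: without it, a transcript that exactly fills the window leaves no room for the response.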