TheBloke/Chronos-13B-SuperHOT-8K-fp16
Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP16 · Ctx Length: 8K · License: other · Architecture: Transformer

TheBloke/Chronos-13B-SuperHOT-8K-fp16 is a 13 billion parameter LLaMA-based model, merged from Elinas' Chronos 13B and Kaio Ken's SuperHOT 8K LoRA. This fp16 PyTorch model is optimized for extended context, supporting up to 8192 tokens for coherent long-form generation. It excels in chat, roleplay, and storywriting, while also handling simple reasoning and coding tasks.


Model Overview

This model, Chronos-13B-SuperHOT-8K-fp16, is a 13 billion parameter LLaMA-based language model published by TheBloke. It merges Elinas' Chronos 13B base model with Kaio Ken's SuperHOT 8K LoRA, which extends context handling by interpolating rotary position embeddings (RoPE) at inference time. With that patch applied, the usable context window grows to 8192 tokens, significantly improving the model's ability to process and generate longer sequences of text.
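The interpolation idea behind SuperHOT can be sketched in a few lines: position indices for a long sequence are compressed so they fall inside the range the base model was trained on. The helper below is illustrative only, not the actual SuperHOT patch, and assumes the commonly cited linear scale factor of 2048/8192 = 0.25:

```python
def interpolated_positions(seq_len: int, trained_len: int = 2048, target_len: int = 8192):
    """Scale position indices so target_len tokens map into the trained position range.

    Linear RoPE interpolation: every position i is replaced by i * (trained_len / target_len),
    so an 8192-token sequence occupies the same positional range as 2048 training tokens.
    """
    scale = trained_len / target_len  # 0.25 for 2048 -> 8192
    return [i * scale for i in range(seq_len)]

positions = interpolated_positions(8192)
# All scaled positions stay below the trained range of 2048.
```

Because positions are compressed rather than extrapolated, the model sees only positional values it encountered during fine-tuning, which is why coherence holds up over long sequences.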

Key Capabilities

  • Extended Context Window: Supports an 8K (8192 token) context length, enabling more coherent and detailed long-form text generation.
  • Merged Architecture: Combines the strengths of Elinas' Chronos 13B, known for chat, roleplay, and storywriting, with the context extension capabilities of SuperHOT 8K.
  • Versatile Generation: Capable of generating very long outputs with high coherence, suitable for creative writing, roleplay scenarios, and general chat.
  • Instruction Following: Uses Alpaca prompt formatting for best results, responding to ### Instruction: and ### Response: markers.
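The Alpaca formatting mentioned above can be built with a small helper. The function name is hypothetical; the template follows the standard Alpaca layout with its ### Instruction:, optional ### Input:, and ### Response: markers:

```python
def alpaca_prompt(instruction: str, user_input: str = "") -> str:
    """Build an Alpaca-style prompt; the model's completion follows '### Response:'."""
    if user_input:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{user_input}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

prompt = alpaca_prompt("Write a short story about a clockmaker.")
```

The generated text begins immediately after the trailing ### Response: line, so the prompt should end exactly there.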

Good For

  • Creative Writing: Generating detailed stories, narratives, and creative content.
  • Roleplay: Engaging in extended roleplay scenarios with a larger memory of the conversation.
  • Long-form Chat: Maintaining context over lengthy dialogues and discussions.
  • Research and Development: Serving as an unquantized fp16 base for further conversions or GPU inference, particularly for tasks requiring extended context.
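For the long-form chat use case above, staying inside the 8192-token window still requires trimming old history once a conversation outgrows it. A minimal sketch, assuming a crude whitespace-based token estimate (real code should count tokens with the model's tokenizer) and a hypothetical helper name:

```python
def trim_history(messages, max_tokens=8192, reserve=512):
    """Drop the oldest messages until the estimated token count, plus a
    generation reserve, fits within the 8K context window."""
    def rough_tokens(text):
        # Crude words->tokens estimate; replace with a real tokenizer count.
        return int(len(text.split()) * 1.3)

    budget = max_tokens - reserve
    kept, total = [], 0
    for msg in reversed(messages):  # walk newest-to-oldest, keeping recent turns
        cost = rough_tokens(msg)
        if total + cost > budget:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order

history = ["User: hi", "Bot: hello!"]
kept = trim_history(history)  # a short history fits untouched
```

Reserving headroom (`reserve`) for the model's own reply avoids truncated generations when the prompt alone nearly fills the window.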