TheBloke/Chronos-13B-SuperHOT-8K-fp16 is a 13 billion parameter LLaMA-based model, merged from Elinas' Chronos 13B and Kaio Ken's SuperHOT 8K LoRA. This fp16 PyTorch model is optimized for extended context, supporting up to 8192 tokens for coherent long-form generation. It excels in chat, roleplay, and storywriting, while also handling simple reasoning and coding tasks.
Model Overview
This model, Chronos-13B-SuperHOT-8K-fp16, is a 13 billion parameter LLaMA-based language model created by TheBloke. It is a merge of Elinas' Chronos 13B base model with Kaio Ken's SuperHOT 8K LoRA, specifically designed to enhance context handling. SuperHOT's scaled rotary position embeddings (RoPE) extend the context window to 8192 tokens at inference time, significantly improving the model's ability to process and generate long sequences of text.
Key Capabilities
- Extended Context Window: Supports an 8K (8192 token) context length, enabling more coherent and detailed long-form text generation.
- Merged Architecture: Combines the strengths of Elinas' Chronos 13B, known for chat, roleplay, and storywriting, with the context extension capabilities of SuperHOT 8K.
- Versatile Generation: Capable of generating very long outputs with high coherence, suitable for creative writing, roleplay scenarios, and general chat.
- Instruction Following: Performs best with Alpaca-style prompt formatting, responding to `### Instruction:` and `### Response:` markers.
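The Alpaca layout referenced above can be assembled with a small helper. This is an illustrative sketch (the function name is made up; the `### Instruction:` / `### Response:` markers are the standard Alpaca ones):

```python
def build_alpaca_prompt(instruction: str, response: str = "") -> str:
    """Assemble an Alpaca-style prompt using the standard section markers."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}"
    )

prompt = build_alpaca_prompt("Write a short story about a clockmaker.")
print(prompt)
```

The generated string is what you would pass to the tokenizer; the model's completion continues after the `### Response:` marker.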
Good For
- Creative Writing: Generating detailed stories, narratives, and creative content.
- Roleplay: Engaging in extended roleplay scenarios with a larger memory of the conversation.
- Long-form Chat: Maintaining context over lengthy dialogues and discussions.
- Research and Development: Serving as an unquantized fp16 base for further conversions or GPU inference, particularly for tasks requiring extended context.
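For long-form chat, the running transcript still has to fit inside the 8192-token window. A minimal sketch of one truncation strategy, assuming a generic token-counting callable (`trim_history` and the whitespace counter are hypothetical stand-ins; real counts would come from the model's LLaMA tokenizer):

```python
from typing import Callable, List

def trim_history(turns: List[str], count_tokens: Callable[[str], int],
                 max_tokens: int = 8192, reserve: int = 512) -> List[str]:
    """Drop the oldest turns until the transcript, plus a generation
    reserve, fits inside the context window."""
    budget = max_tokens - reserve
    kept: List[str] = []
    total = 0
    # Walk newest-first so the most recent context survives truncation.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if total + cost > budget:
            break
        kept.append(turn)
        total += cost
    return list(reversed(kept))

# Crude whitespace split stands in for a real tokenizer here.
approx = lambda text: len(text.split())
history = [f"turn {i}: " + "word " * 100 for i in range(200)]
trimmed = trim_history(history, approx)
```

Reserving headroom for the generated tokens (here 512) is the key design choice: without it, a transcript that exactly fills the window leaves no room for the response.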