Peeepy/Airoboros-13b-SuperHOT-8k
Peeepy/Airoboros-13b-SuperHOT-8k is a 13 billion parameter language model created by Peeepy, merging the Airoboros 13b GPT4 1.4 model with kaiokendev's SuperHOT 8k LoRA. This model is specifically designed for extended context handling, supporting an effective context length of up to 8192 tokens through a required monkey patch. It is optimized for tasks benefiting from longer conversational history and detailed document processing.
Peeepy/Airoboros-13b-SuperHOT-8k Overview
Peeepy/Airoboros-13b-SuperHOT-8k is a 13 billion parameter language model that combines the instruction-following capabilities of Airoboros 13b GPT4 1.4 with the extended context handling of kaiokendev's SuperHOT 8k LoRA. This merge aims to provide a robust model capable of processing significantly longer sequences than its base models.
Key Capabilities
- Extended Context Length: Designed to operate with an effective context window of up to 8192 tokens, significantly enhancing its ability to maintain coherence and recall information over long interactions or documents. This is achieved through a monkey patch that adjusts max_position_embeddings and the position-embedding frequency steps.
- Merged Architecture: Leverages the strengths of both Airoboros for instruction tuning and SuperHOT for context extension, potentially offering improved performance in tasks requiring both detailed understanding and long-range dependencies.
Usage Notes
To fully utilize the 8K context length, a specific monkey patch is required. This patch modifies the model's configuration to stretch sinusoidal position embeddings and adjust frequency steps. Without this patch, the model will not function with the intended extended context. Users can find the necessary patch and instructions within the model's repository or associated resources.
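The core idea behind SuperHOT-style context extension is linear position interpolation: positions in the extended 8192-token window are scaled down so they fall inside the 2048-position range the base model was trained on. The actual patch lives in the model's repository; the sketch below is only an illustrative approximation of the math, and the function name and defaults are assumptions, not the repository's API.

```python
import math

# Illustrative sketch (not the official patch): rotary-style position
# angles with a linear interpolation factor. With scale = 2048 / 8192,
# every position in an 8192-token window maps into the 0..2048 range
# the base model saw during training.

def rope_angles(position, dim=128, base=10000.0, scale=1.0):
    """Per-frequency angles for one token position.

    scale < 1.0 compresses positions: position 8191 with scale 0.25
    produces the same angles as position 2047.75 in the unpatched model.
    """
    pos = position * scale
    return [pos / (base ** (2 * i / dim)) for i in range(dim // 2)]

# Position 8191 under interpolation equals position 2047.75 unscaled,
# so the extended window stays within the trained embedding range.
assert rope_angles(8191, scale=2048 / 8192) == rope_angles(2047.75)
```

In practice the monkey patch applies this scaling inside the model's rotary embedding module (and raises max_position_embeddings to 8192) before loading weights, which is why it must be applied prior to inference rather than afterward.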