TheBloke/guanaco-13B-SuperHOT-8K-fp16 is a 13 billion parameter language model, a merge of Tim Dettmers' Guanaco 13B and Kaio Ken's SuperHOT 8K LoRA. This fp16 PyTorch model is designed for GPU inference and features an extended context window of 8192 tokens. It leverages the SuperHOT technique to enhance context handling, making it suitable for applications requiring longer conversational memory or document processing.
Overview
This model, guanaco-13B-SuperHOT-8K-fp16, is a 13 billion parameter language model created by TheBloke. It is a merge of two distinct components: Tim Dettmers' Guanaco 13B base model and Kaio Ken's SuperHOT 8K LoRA. The integration of the SuperHOT 8K LoRA is specifically designed to extend the model's effective context window to 8192 tokens, a significant increase over standard models.
Key Capabilities
- Extended Context Window: Achieves an 8K (8192 token) context length, enabling the model to process and generate longer sequences of text.
- FP16 Precision: Provided in fp16 PyTorch format, suitable for GPU inference and as a starting point for further conversion to quantized formats.
- Merged Architecture: Combines the strengths of the Guanaco 13B base with the context-extending capabilities of the SuperHOT 8K LoRA.
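The model card does not include the context-extension code itself, but SuperHOT-style extension works by interpolating rotary position embeddings: position indices are scaled down so that an 8192-token sequence maps into the 0–2048 positional range the base model was trained on. The sketch below is illustrative only; the function name and dimensions are not from the original repository.

```python
import numpy as np

def rope_angles(positions, dim=128, base=10000.0, scale=1.0):
    """Rotary-embedding angles for the given positions.

    With scale=1.0 this is standard RoPE; a scale < 1 implements
    SuperHOT-style position interpolation, squeezing long positions
    back into the base model's trained range.
    """
    # One inverse frequency per rotated pair of dimensions.
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return np.outer(np.asarray(positions, dtype=np.float64) * scale, inv_freq)

# 2048 / 8192 = 0.25: token position 8191 gets the same angles as
# unscaled position 2047.75, well inside the original 2048 window.
scaled = rope_angles([8191], scale=2048 / 8192)
```

This is why the merged model can attend over 8K tokens without the base model ever having seen positions beyond 2048.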
Training Details (SuperHOT LoRA)
The SuperHOT LoRA was trained with specific configurations to achieve its extended context capabilities:
- 1200 training samples, roughly 400 of which exceed 2048 tokens in length.
- Learning rate of 3e-4 over 3 epochs.
- LoRA applied to the q_proj, k_proj, v_proj, and o_proj modules, with a rank of 4 and alpha of 8.
- Trained on a 4-bit base model.
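For reference, the reported hyperparameters can be collected into a peft-style configuration. This is a plain-dict sketch whose field names follow peft's `LoraConfig`; it is not taken from the original training script.

```python
# Reported SuperHOT LoRA hyperparameters, expressed as a
# peft-LoraConfig-style dict (field names are an assumption).
superhot_lora_config = {
    "r": 4,                                               # LoRA rank
    "lora_alpha": 8,                                      # LoRA alpha
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "learning_rate": 3e-4,                                # over 3 epochs
    "num_epochs": 3,
    "load_in_4bit": True,                                 # 4-bit base model
}
```

The low rank (4) keeps the adapter small; its job is only to teach the model the interpolated positional scheme, not new knowledge.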
Good For
- Applications requiring processing or generating long texts, such as detailed summaries, extended conversations, or document analysis.
- Developers looking for a 13B parameter model with enhanced context handling for GPU-based inference.
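A minimal inference sketch with Hugging Face transformers is shown below. The prompt template follows the "### Human: / ### Assistant:" convention commonly used with Guanaco models; `trust_remote_code` is included because SuperHOT repositories may ship custom RoPE-scaling code, and both details should be checked against the repository before use.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "TheBloke/guanaco-13B-SuperHOT-8K-fp16"

def build_prompt(user_message: str) -> str:
    # Guanaco-style prompt format (assumed; verify against the model card).
    return f"### Human: {user_message}\n### Assistant:"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # fp16 weights, intended for GPU inference
    device_map="auto",
    trust_remote_code=True,     # may be needed for the 8K RoPE scaling
)

inputs = tokenizer(
    build_prompt("Summarize the key points of the document above."),
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

A 13B fp16 model needs roughly 26 GB of VRAM for the weights alone, plus additional memory for the KV cache, which grows with the 8K context.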