TheBloke/Manticore-13B-Chat-Pyg-Guanaco-SuperHOT-8K-fp16


TheBloke/Manticore-13B-Chat-Pyg-Guanaco-SuperHOT-8K-fp16 is a 13 billion parameter language model: a merge of Monero/Manticore-13b-Chat-Pyg-Guanaco with Kaio Ken's SuperHOT 8K LoRA. This fp16 PyTorch model supports an extended context length of 8192 tokens via RoPE position scaling, and is primarily intended for GPU inference in scenarios requiring longer conversational memory or processing of extensive documents.


Overview

This model, Manticore-13B-Chat-Pyg-Guanaco-SuperHOT-8K-fp16, is a 13 billion parameter language model created by TheBloke. It is a merge of the Monero/Manticore-13b-Chat-Pyg-Guanaco base model with Kaio Ken's SuperHOT 8K LoRA. The primary distinguishing feature of this model is its extended context window of 8192 tokens, achieved through a specific RoPE scaling technique.

Key Capabilities

  • Extended Context: Supports an 8K (8192 token) context length, enabling the model to process and generate longer sequences of text while maintaining coherence.
  • FP16 Precision: Provided in fp16 PyTorch format, suitable for GPU inference and further conversions.
  • Merged Architecture: Combines the Manticore-13B-Chat-Pyg-Guanaco base with the SuperHOT 8K LoRA, integrating their respective strengths.
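The SuperHOT 8K extension works by linearly interpolating rotary position embeddings: token positions are compressed by a fixed factor (here 2048 / 8192 = 0.25) so that an 8192-token sequence falls inside the positional range the base model saw during 2048-token pretraining. A minimal sketch of that idea, using NumPy and hypothetical parameter names (`dim`, `base`, `scale` are illustrative, not taken from the model's config):

```python
import numpy as np

def rope_angles(positions, dim=128, base=10000.0, scale=1.0):
    """Rotary-embedding angles for the given token positions.

    scale < 1 compresses positions (SuperHOT-style linear interpolation):
    with scale = 2048 / 8192 = 0.25, extended positions map back into
    the range covered by 2048-token pretraining.
    """
    # One inverse frequency per rotated dimension pair.
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    # Interpolated (compressed) positions.
    pos = np.asarray(positions, dtype=np.float64) * scale
    # Angle matrix of shape (len(positions), dim // 2).
    return np.outer(pos, inv_freq)

# With scale 0.25, the angles at position 8188 equal the unscaled
# angles at position 2047 (8188 * 0.25 == 2047).
```

This is why the fp16 checkpoint must be served with the matching scaling factor: loading it with unscaled RoPE would place long-context tokens at positions the weights were never trained on.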

Good For

  • Long-form Content Generation: Ideal for applications requiring the model to maintain context over extensive dialogues or documents.
  • GPU Inference: Optimized for deployment on GPUs due to its fp16 PyTorch format.
  • Further Conversions: Serves as an unquantized base for users who wish to perform their own quantizations or modifications.
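As a sketch of GPU deployment, the kwargs below assemble a plausible `transformers` `from_pretrained` call for this checkpoint. The `rope_scaling` entry supplies the linear interpolation factor explicitly, since SuperHOT-era repos predate native support for it; the helper and its defaults are assumptions for illustration, not taken from the repo's config:

```python
MODEL_ID = "TheBloke/Manticore-13B-Chat-Pyg-Guanaco-SuperHOT-8K-fp16"

def load_kwargs(max_ctx: int = 8192, base_ctx: int = 2048) -> dict:
    """Hypothetical keyword arguments for transformers' from_pretrained.

    The linear interpolation factor is target / base context = 4.0.
    """
    return {
        "torch_dtype": "float16",   # fp16 weights for GPU inference
        "device_map": "auto",       # spread layers across available GPUs
        "rope_scaling": {"type": "linear", "factor": max_ctx / base_ctx},
    }

# Sketch of actual loading (an fp16 13B model needs roughly 26 GB of GPU memory):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained(MODEL_ID)
# model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **load_kwargs())
```

Users wanting smaller footprints would typically quantize from this fp16 base (e.g. to GPTQ or GGML formats) rather than run it as-is.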