TheBloke/GPT4All-13B-Snoozy-SuperHOT-8K-fp16
TEXT GENERATION
Concurrency Cost: 1 | Model Size: 13B | Quant: FP16 | Ctx Length: 8K | Published: Jun 27, 2023 | License: other | Architecture: Transformer
TheBloke/GPT4All-13B-Snoozy-SuperHOT-8K-fp16 is a 13-billion-parameter Llama-based model, developed by Nomic AI and merged with Kaio Ken's SuperHOT 8K LoRA. This fp16 PyTorch build is optimized for GPU inference and extends the context length to 8192 tokens. The base model performs strongly on common sense reasoning benchmarks such as BoolQ and WinoGrande.
Model Overview
This model, GPT4All-13B-Snoozy-SuperHOT-8K-fp16, is a 13 billion parameter Llama-based language model. It is a merge of Nomic AI's GPT4All Snoozy 13B with Kaio Ken's SuperHOT 8K LoRA, specifically provided in fp16 PyTorch format for GPU inference.
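For reference, a merge like this one can be reproduced with the `peft` library. This is a minimal sketch only: the base and LoRA repo IDs below are assumptions for illustration, and the published model already ships pre-merged.

```python
# Sketch of folding a SuperHOT-style LoRA into a base model with peft.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

BASE = "nomic-ai/gpt4all-13b-snoozy"               # assumed base repo ID
LORA = "kaiokendev/superhot-13b-8k-no-rlhf-test"   # assumed LoRA repo ID

base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base, LORA)
merged = merged.merge_and_unload()  # fold the LoRA weights into the base model
merged.save_pretrained("gpt4all-13b-snoozy-superhot-8k-fp16")
```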
Key Capabilities
- Extended Context Window: Achieves an 8192-token context length during inference by leveraging the SuperHOT 8K merge and `trust_remote_code=True` in Hugging Face Transformers (see the loading sketch after this list).
- Common Sense Reasoning: The base GPT4All 13B Snoozy model shows strong performance on common sense reasoning benchmarks, outperforming several other 7B and 13B models in categories like BoolQ and WinoGrande.
- Instruction Following: Finetuned on a curated corpus of assistant interactions, including multi-turn dialogue, code, poems, and stories, indicating proficiency in instruction-tuned tasks.
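A minimal loading sketch, assuming a CUDA GPU and following the card's note that `trust_remote_code=True` is required for the 8192-token context. The Alpaca-style prompt template is an assumption; consult the model card for the exact format the finetune expects.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "TheBloke/GPT4All-13B-Snoozy-SuperHOT-8K-fp16"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    torch_dtype=torch.float16,   # fp16 weights for GPU inference
    device_map="auto",
    trust_remote_code=True,      # enables the extended-context attention code
)

# Alpaca-style prompt is an assumption; check the card for the real template.
prompt = (
    "### Instruction:\nSummarize the plot of Hamlet in two sentences.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```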
Good For
- Applications requiring a larger context window for more coherent and extended interactions.
- Tasks benefiting from strong common sense reasoning capabilities.
- Developers looking for an fp16 PyTorch model for GPU inference, with options for further quantization (GPTQ, GGML) available from TheBloke.
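For the quantized variants mentioned above, a hedged sketch of loading a GPTQ build with AutoGPTQ follows; the `-GPTQ` repo ID is inferred from TheBloke's usual naming scheme, not confirmed here.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

MODEL = "TheBloke/GPT4All-13B-Snoozy-SuperHOT-8K-GPTQ"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(MODEL, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    MODEL,
    device="cuda:0",
    use_safetensors=True,
    trust_remote_code=True,  # still needed for the extended-context code
)
```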