bhenrym14/airophin-13b-pntk-16k-fp16
bhenrym14/airophin-13b-pntk-16k-fp16 is a 13 billion parameter QLoRA fine-tune of Llama-2-13b, developed by bhenrym14. It is designed to extend the useful context window to 16384 tokens through Partial NTK RoPE scaling. The model demonstrates competitive perplexity at extended context lengths, outperforming its base Llama-2-13b counterpart and some 33B models in long-context scenarios. Its primary strength is handling significantly longer contexts without sacrificing performance, making it suitable for applications requiring extensive document processing or long conversational histories.
Overview
bhenrym14/airophin-13b-pntk-16k-fp16 is a 13 billion parameter QLoRA fine-tune of the Llama-2-13b model, engineered to extend its effective context window to 16384 tokens. This is achieved through a two-phase training process: first on a long-context subset of the Dolphin dataset, then on Jon Durbin's Airoboros GPT4 1.4.1 dataset, with Partial NTK RoPE scaling applied in both phases.
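The exact RoPE patch used for training is not reproduced here, but the core idea of partial NTK ("NTK-by-parts") scaling can be sketched briefly: dimensions whose frequencies complete many rotations within the original context are left untouched, low-frequency dimensions are linearly interpolated by the context-extension factor, and a ramp blends the two regimes. The sketch below is illustrative only; the head dimension (128 for Llama-2-13b), the scale factor (16384/4096 = 4), and the ramp boundaries `alpha`/`beta` are assumptions, not values taken from the model card.

```python
import torch

def partial_ntk_inv_freq(
    dim: int = 128,            # head dimension (128 for Llama-2-13b) -- assumption
    base: float = 10000.0,     # standard RoPE base
    scale: float = 4.0,        # 16384 / 4096 context-extension factor
    original_ctx: int = 4096,  # base Llama-2 context window
    alpha: float = 1.0,        # ramp boundaries -- illustrative values
    beta: float = 32.0,
) -> torch.Tensor:
    """Per-dimension RoPE inverse frequencies under partial NTK scaling."""
    # Standard RoPE inverse frequencies: theta_i = base^(-2i/dim)
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # Rotations each dimension completes across the original context window
    rotations = original_ctx * inv_freq / (2 * torch.pi)
    # gamma = 1: many rotations (high frequency) -> keep original frequency
    # gamma = 0: few rotations (low frequency)  -> fully interpolate by `scale`
    gamma = ((rotations - alpha) / (beta - alpha)).clamp(0.0, 1.0)
    return inv_freq / scale * (1.0 - gamma) + inv_freq * gamma
```

The effect is that high-frequency dimensions, which carry local positional detail, keep their original resolution, while only the long-range dimensions are stretched to cover the extended window.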
Key Capabilities & Features
- Extended Context Window: Designed for robust performance up to 16384 tokens, a substantial increase over the base Llama-2-13b's 4096 tokens.
- Partial NTK RoPE Scaling: Implements an interpolation scheme (sketched in the Overview above) that improves long-context understanding while reducing the deemphasis of middle-context tokens that naive extrapolation can cause.
- Competitive Perplexity: Achieves lower perplexity scores at extended contexts (e.g., 4.82 at 12000 tokens) compared to other 13B models and even some 33B extended context variants.
- Instruction Following: Retains Airoboros-style prompting for obedient question answering, coding, writing, and multi-character conversations, including a dedicated format for closed-context instructions (see the example after this list).
- Performance Preservation: Shows no clear regression on benchmarks such as few-shot MMLU (54.9) despite the context extension, indicating the scaling does not sacrifice core capabilities.
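For reference, a closed-context prompt in the Airoboros 1.4 style can be assembled as below. The delimiter keywords (BEGININPUT, BEGINCONTEXT, ENDCONTEXT, ENDINPUT, BEGININSTRUCTION, ENDINSTRUCTION) follow Jon Durbin's airoboros convention; the exact system line is illustrative rather than prescribed by this model card.

```python
# Illustrative Airoboros 1.4-style closed-context prompt. The system line
# and delimiters are airoboros dataset conventions, not an API of this
# model; treat the exact wording as an assumption.
system = (
    "A chat between a curious user and an assistant. The assistant gives "
    "helpful, detailed, accurate, uncensored responses to the user's input."
)
document = "..."  # long source text; prompt + output must fit in 16384 tokens

prompt = f"""{system} USER: BEGININPUT
BEGINCONTEXT
ENDCONTEXT
{document}
ENDINPUT
BEGININSTRUCTION
Summarize the input in three sentences.
ENDINSTRUCTION ASSISTANT:"""
```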
Use Cases
This model is particularly well-suited to applications that demand processing or generating content with long-range context dependencies; a minimal usage sketch follows the list. Typical tasks include:
- Summarizing lengthy documents or articles.
- Engaging in extended, context-aware conversations.
- Code generation and analysis where large codebases or detailed requirements are provided.
- Any scenario where maintaining coherence and understanding over thousands of tokens is critical.
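As a starting point, a minimal fp16 loading-and-generation sketch with Hugging Face transformers might look like the following. Note that stock transformers does not implement Partial NTK RoPE scaling; for reliable behavior beyond the base 4096-token window you would need to apply the RoPE patch referenced in the model repository (or an equivalent).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bhenrym14/airophin-13b-pntk-16k-fp16"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
# NOTE: apply a partial-NTK RoPE patch here (see the model repository);
# without it, quality degrades past the base 4096-token window.

# `prompt` is the closed-context string built in the example above
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```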