Overview

SelfLong-Llama3.2-1B-Instruct-1M is a 1-billion-parameter instruction-tuned model in the SelfLong series, designed for very long contexts. It is based on the Llama-3.2 architecture and accepts inputs of up to 1 million tokens, which suits applications that need to reason over an entire long document or conversation in a single pass.

Key Capabilities

Extreme Context Length: Supports a context window of up to 1 million tokens, well beyond the context windows of many conventional LLMs.
Instruction Following: Optimized for instruction-based tasks, leveraging its Llama-3.2-Instruct foundation.
Long-Context Reasoning: Evaluated on the RULER-1M benchmark, which assesses long-context understanding across a range of support lengths up to 1M tokens.
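
To give a rough sense of what a 1-million-token context costs at inference time, the sketch below estimates the key/value cache size for a given number of tokens. The default layer count, KV-head count, and head dimension are assumptions taken from the public Llama-3.2-1B configuration, not values stated in this card, so check them against the model's config.json before relying on the numbers. The estimate also assumes 16-bit cache values and no cache compression or offloading.

```python
def kv_cache_bytes(
    num_tokens: int,
    num_layers: int = 16,      # assumption: Llama-3.2-1B config
    num_kv_heads: int = 8,     # assumption: grouped-query attention with 8 KV heads
    head_dim: int = 64,        # assumption: 2048 hidden size / 32 attention heads
    bytes_per_value: int = 2,  # assumption: bf16/fp16 cache
) -> int:
    """Return the bytes needed to cache keys and values for num_tokens tokens."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * num_tokens


# Print the estimate for a few context lengths.
for n in (128_000, 1_000_000):
    print(f"{n:>9,} tokens: {kv_cache_bytes(n) / 2**30:.1f} GiB")
```

Under these assumptions the cache grows linearly with context length, so memory for the cache, rather than the 1B parameters themselves, is likely to dominate at the longest inputs.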

Performance Highlights

On the RULER-1M benchmark, SelfLong-Llama3.2-1B-Instruct-1M achieves a RULER score of 31.1 at the 1M-token support length. The larger SelfLong models (3B and 8B) score higher; this 1B variant is the compact option for long-context applications.

Good For

Applications requiring processing and understanding of very long documents or conversations.
Tasks like summarization, question answering, or information extraction from extensive texts.
Developers seeking a smaller, efficient model capable of handling large context windows.
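
For documents that may exceed even a 1-million-token window, a simple pre-check can split the tokenized input into pieces that fit. The sketch below is a minimal illustration, not part of the model's tooling: `split_to_fit`, `MAX_CONTEXT_TOKENS`, and the `reserve_for_output` default are hypothetical names and values, and the exact usable limit should be taken from the model's configuration.

```python
from typing import List, Sequence

# Assumption: the card states "up to 1 million tokens"; the exact limit may differ.
MAX_CONTEXT_TOKENS = 1_000_000


def split_to_fit(
    token_ids: Sequence[int],
    reserve_for_output: int = 1024,  # hypothetical budget left for generated tokens
    max_context: int = MAX_CONTEXT_TOKENS,
) -> List[Sequence[int]]:
    """Split token_ids into consecutive chunks that each fit the context window."""
    budget = max_context - reserve_for_output
    if budget <= 0:
        raise ValueError("reserve_for_output must be smaller than max_context")
    # Non-overlapping chunks; an input within budget yields a single chunk.
    return [token_ids[i:i + budget] for i in range(0, len(token_ids), budget)]
```

Token IDs would come from the model's tokenizer. An input that already fits comes back as one chunk, so the same code path handles both short and oversized documents.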
