self-long/SelfLong-Llama3.2-1B-Instruct-1M

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1BQuant:BF16Ctx Length:32kPublished:Mar 17, 2025License:llama3.2Architecture:Transformer0.0K Warm

SelfLong-Llama3.2-1B-Instruct-1M is a 1 billion parameter instruction-tuned language model from the SelfLong series, initialized from the Llama-3.2 architecture. Developed by Wang et al., this model is specifically engineered to handle extremely long contexts, supporting up to 1 million tokens. It excels in tasks requiring extensive context understanding, as demonstrated by its performance on the RULER-1M benchmark.

Loading preview...

Overview

SelfLong-Llama3.2-1B-Instruct-1M is a 1 billion parameter instruction-tuned model, part of the SelfLong series, designed for processing exceptionally long contexts. Based on the Llama-3.2 architecture, this model is distinguished by its ability to manage up to 1 million tokens, making it suitable for applications requiring deep contextual understanding.

Key Capabilities

  • Extreme Context Length: Supports an impressive context window of up to 1 million tokens, significantly surpassing many conventional LLMs.
  • Instruction Following: Optimized for instruction-based tasks, leveraging its Llama-3.2-Instruct foundation.
  • Long-Context Reasoning: Evaluated and shown to perform effectively on the RULER-1M benchmark, which assesses long-context understanding across various support lengths.

Performance Highlights

On the RULER-1M benchmark, SelfLong-1B-1M demonstrates its long-context capabilities, achieving a RULER score of 31.1 at the 1M token support length. While larger SelfLong models (3B and 8B) show higher scores, this 1B variant provides a compact option for long-context applications.

Good For

  • Applications requiring processing and understanding of very long documents or conversations.
  • Tasks like summarization, question answering, or information extraction from extensive texts.
  • Developers seeking a smaller, efficient model capable of handling large context windows.