ellamind/sui-1-24b

Vision | Concurrency Cost: 2 | Model Size: 24B | Quant: FP8 | Ctx Length: 32k | Published: Jan 14, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

ellamind/sui-1-24b is a specialized 24-billion-parameter summarization model developed by ellamind for high-quality, verifiable summarization of very long documents, up to 2 million tokens with iterative processing. Its source grounding mechanism links every summary claim to its original sentence via XML tags, which significantly reduces hallucination risk. Optimized for single-GPU deployment, it supports English, German, Spanish, French, and Italian, and excels at maintaining factual accuracy and coverage for complex texts.


sui-1: Grounded and Verifiable Long-Form Summarization

sui-1 (Summarization with Unique Identifiers) by ellamind is a specialized 24-billion-parameter model for high-quality summarization of extremely long documents; an iterative approach lets it process up to 2 million tokens. Its core differentiator is a built-in source grounding mechanism: every claim in the summary is linked via XML tags to its exact source sentence in the original text, which enables full traceability and significantly reduces hallucination risk.
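To make the grounding idea concrete, here is a minimal Python sketch of how a consumer might resolve tagged claims back to source sentences. The tag layout (`<claim src="...">`), the 1-based sentence ids, and the naive sentence splitter are all assumptions made for illustration; the actual sui-1 output format is not described here and may differ.

```python
import re

# ASSUMPTION: hypothetical tag layout, not the documented sui-1 format.
# Each claim cites one or more 1-based source sentence ids.
CLAIM_RE = re.compile(r'<claim src="([\d,\s]+)">(.*?)</claim>', re.DOTALL)


def split_sentences(text):
    """Naive sentence splitter; returns {id: sentence} with 1-based ids."""
    parts = [p.strip() for p in re.split(r"(?<=[.!?])\s+", text) if p.strip()]
    return {i: s for i, s in enumerate(parts, start=1)}


def resolve_claims(summary, sentences):
    """Pair each tagged claim with the source sentences it cites."""
    resolved = []
    for ids, claim in CLAIM_RE.findall(summary):
        cited = [int(x) for x in ids.split(",") if x.strip()]
        resolved.append({
            "claim": claim.strip(),
            "sources": [sentences[i] for i in cited if i in sentences],
            # Ids that point at no source sentence are flagged, not dropped.
            "missing_ids": [i for i in cited if i not in sentences],
        })
    return resolved


source = ("The budget rose by 4% in 2024. Spending on rail doubled. "
          "The vote passed narrowly.")
summary = ('<claim src="1">The budget grew 4%.</claim> '
           '<claim src="2, 9">Rail spending doubled.</claim>')

sentences = split_sentences(source)
for item in resolve_claims(summary, sentences):
    print(item)
```

A claim whose `missing_ids` list is non-empty cites a sentence that does not exist in the source, which is the kind of mismatch an audit step would want to surface.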

Key Capabilities

  • Verifiable Outputs: Each summary claim is traceable to its source sentence, so it can be checked against the original text, which reduces hallucination risk.
  • Very Long Document Processing: Natively handles up to 128k tokens, with an iterative two-step approach extending support to documents up to 2 million tokens.
  • Efficient Deployment: The FP8-quantized variant runs on a single A100 40GB or A6000 48GB GPU, bringing it within reach of more modest hardware.
  • Multilingual Support: Fine-tuned for English, German, Spanish, French, and Italian, and inherits support for 20+ additional languages from its Mistral Small 3.2 base.
  • High-Quality Training: Trained on over 22,000 examples generated with a sophisticated chain-of-thought reasoning and multi-stage verification pipeline.
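The iterative two-step approach mentioned above is not spelled out here. One common way to extend a fixed context window is to summarize chunks first and then summarize the combined partial summaries, and the sketch below shows that generic pattern only. The `summarize` callable, the word-based chunking, and `max_words` are hypothetical stand-ins, not the sui-1 API or its actual pipeline.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    ASSUMPTION: whitespace words stand in for model tokens.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def summarize_long(text: str,
                   summarize: Callable[[str], str],
                   max_words: int = 1000) -> str:
    """Two-step summarization: per-chunk summaries, then one combining pass."""
    chunks = chunk_words(text, max_words)
    if len(chunks) <= 1:
        return summarize(text)          # fits in one window, single pass
    partials = [summarize(c) for c in chunks]   # step 1: summarize each chunk
    return summarize("\n".join(partials))       # step 2: summarize the summaries


def fake_summarize(text: str) -> str:
    """Stand-in for a model call: keeps the first five words."""
    return " ".join(text.split()[:5])


doc = " ".join(f"w{i}" for i in range(25))
print(summarize_long(doc, fake_summarize, max_words=10))
```

With exactly two steps, the combined partial summaries must themselves fit in the context window, which is what bounds the maximum document length in a scheme like this.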

Good for

  • Enterprise-grade summarization: Ideal for applications requiring high factual accuracy and auditability, such as legal, financial, or research document analysis.
  • Processing extensive reports: Effectively summarizes very long texts like parliamentary documents, academic papers, or large web compilations.
  • Resource-constrained environments: The FP8 variant and iterative processing allow deployment on GPUs with limited VRAM, making advanced summarization more accessible.
  • Multilingual content analysis: Provides high-quality summarization across several major European languages.
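The single-GPU claim can be sanity-checked with back-of-the-envelope arithmetic on weight storage alone. The sketch below assumes every parameter is stored at the stated precision (1 byte for FP8, 2 bytes for 16-bit) and ignores the KV cache, activations, and runtime overhead, so it is a rough lower bound and not a measured footprint.

```python
def weight_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes).

    ASSUMPTION: all parameters are stored at the same precision.
    """
    return params_billion * bytes_per_param


PARAMS_B = 24  # 24 billion parameters, as stated for sui-1

fp8_gb = weight_gb(PARAMS_B, 1)    # FP8: 1 byte per parameter
bf16_gb = weight_gb(PARAMS_B, 2)   # 16-bit: 2 bytes per parameter

for gpu_name, vram_gb in [("A100 40GB", 40), ("A6000 48GB", 48)]:
    headroom = vram_gb - fp8_gb
    print(f"{gpu_name}: FP8 weights {fp8_gb} GB, "
          f"about {headroom} GB left for KV cache and activations")

print(f"16-bit weights alone would need about {bf16_gb} GB")
```

Under these assumptions the FP8 weights take roughly 24 GB, while 16-bit weights at roughly 48 GB would already fill or exceed both cards before any context is processed.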