OriOn-Mistral: Long-Context Visual Document and Text Understanding

OriOn-Mistral is a 24 billion parameter model developed by LightOn, based on the Mistral-Small-3.1-24B-Instruct architecture. It has been specifically trained using CPT (Contextualized Pre-training) and SFT (Supervised Fine-Tuning) on synthetic data to excel in long-context visual document performance (such as PDF VQA and multi-page reasoning) and significantly boost text long-context capabilities.

Key Capabilities and Features

Extended Context Length: Supports an impressive 344K context length, trained on documents up to 336 pages, making it suitable for processing very long inputs.
Visual Document Understanding: Achieves substantial gains in long-document VQA, scoring 46.6 on MMLongBenchDoc (+16.8% compared to baseline).
Text Long-Context Performance: Demonstrates strong improvements in text long-context tasks, with a score of 53.1 on HELMET (+43.5% compared to baseline).
Reproducibility: Training recipes and ablations are available via the OriOn-Leaderboard.
Efficient Serving: Designed for drop-in serving with vLLM.

Intended Use Cases

OriOn-Mistral is particularly well-suited for applications requiring:

Long PDF / slide-deck QA and understanding: Performing one-shot question answering over entire documents.
Complex long-context reasoning: Leveraging its visual and text long-context training for enhanced reasoning across extensive materials.

Overview

OriOn-Mistral: Long-Context Visual Document and Text Understanding

Key Capabilities and Features

Intended Use Cases

Full Model Card (README)