Ostrakon/Ostrakon-VL-8B

VISIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Feb 2, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Ostrakon/Ostrakon-VL-8B is an 8 billion parameter multimodal large language model (MLLM) developed by Ostrakon-VL, specifically engineered for Food-Service and Retail Store (FSRS) scenarios. This model excels at real-world retail perception, compliance, and decision-making tasks, outperforming larger general-purpose MLLMs on domain-specific benchmarks. It features a 32768 token context length and is optimized for complex visual understanding in cluttered retail environments.

Loading preview...

Ostrakon-VL-8B: Domain-Expert MLLM for Retail

Ostrakon-VL-8B is an 8 billion parameter Multimodal Large Language Model (MLLM) developed by Ostrakon-VL, uniquely designed for Food-Service and Retail Store (FSRS) applications. It is part of the Ostrakon-VL family, which includes 8B and 30B variants, and is notable for its specialized performance in retail perception and compliance tasks.

Key Capabilities & Features

  • FSRS Specialization: The first open-source MLLM explicitly designed for the complexities of food-service and retail environments.
  • ShopBench Performance: Achieves a 60.1 average score on ShopBench, a novel public benchmark for FSRS scenarios, demonstrating competitive results among open-source models of similar scale.
  • Multi-format Input Handling: Capable of processing single-image, multi-image, and video inputs for comprehensive retail analysis.
  • High Visual Complexity: Designed to handle scenes with high instance density (13.0 objects/image), typical of cluttered retail settings.
  • Advanced Training Strategy: Utilizes a multi-stage training approach including Caption Bootstrapping (CB) for domain knowledge injection, Offline Curriculum Learning (OCL) for difficulty-stratified training, and Mixed Preference Optimization (MPO) for aligning responses with rule-compliant reasoning.

When to Use Ostrakon-VL-8B

  • Retail Analytics: Ideal for tasks such as shelf compliance checks, inventory monitoring, store layout analysis, and identifying potential issues in retail environments.
  • Food Service Operations: Suitable for applications like kitchen hygiene audits, food preparation monitoring, and general operational oversight in restaurants and cafes.
  • Domain-Specific MLLM Research: Provides a specialized model and benchmark (ShopBench) for researchers focusing on MLLMs in niche, visually complex domains.

While specialized, Ostrakon-VL-8B also shows competitive performance on general multimodal benchmarks, indicating broad multimodal capabilities alongside its domain expertise.