inclusionAI/Sing-Guard-2b

VISIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:May 25, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

SingGuard-2b is a 2 billion parameter multimodal LLM guardrail developed by inclusionAI, designed for policy-adaptive safety assessment across text, image, and multilingual content. It uniquely treats safety policies as runtime inputs, allowing dynamic evaluation against custom natural-language rules without retraining. This model excels at unified multimodal moderation and provides strong benchmark performance in identifying and categorizing risks in user queries and model responses.

Loading preview...

SingGuard-2b: Policy-Adaptive Multimodal Guardrail

SingGuard-2b, developed by inclusionAI, is a 2 billion parameter multimodal large language model specifically engineered as a guardrail for content safety. Unlike traditional guardrails with fixed taxonomies, SingGuard-2b allows for runtime policy adaptation, meaning deployment teams can define and apply custom natural-language safety rules without needing to retrain the model. This flexibility enables dynamic moderation against evolving risk landscapes.

Key Capabilities

  • Unified Multimodal Moderation: Assesses safety across diverse inputs including text, images, image-text combinations, and multilingual content, covering both user queries and model responses.
  • Dynamic Reasoning Flow: Features a fast-slow mode for immediate safety signals, followed by deeper reasoning for precise judgments, outputting both a binary safe/unsafe verdict and a matched risk category.
  • Strong Benchmark Performance: Achieves state-of-the-art average performance across six major benchmark categories for multimodal, image-only, text query, text response, and multilingual safety.
  • Native Inference Compatibility: Integrates seamlessly with standard Transformers and vLLM chat-style message inputs.

Good For

  • Developers needing a flexible, adaptable safety layer for multimodal AI applications.
  • Moderation systems requiring dynamic policy updates without model retraining.
  • Identifying and categorizing risks in user-generated content or AI-generated responses across various modalities and languages.