inclusionAI/Sing-Guard-2b
SingGuard-2b is a 2 billion parameter multimodal LLM guardrail developed by inclusionAI, designed for policy-adaptive safety assessment across text, image, and multilingual content. It uniquely treats safety policies as runtime inputs, allowing dynamic evaluation against custom natural-language rules without retraining. This model excels at unified multimodal moderation and provides strong benchmark performance in identifying and categorizing risks in user queries and model responses.
Loading preview...
SingGuard-2b: Policy-Adaptive Multimodal Guardrail
SingGuard-2b, developed by inclusionAI, is a 2 billion parameter multimodal large language model specifically engineered as a guardrail for content safety. Unlike traditional guardrails with fixed taxonomies, SingGuard-2b allows for runtime policy adaptation, meaning deployment teams can define and apply custom natural-language safety rules without needing to retrain the model. This flexibility enables dynamic moderation against evolving risk landscapes.
Key Capabilities
- Unified Multimodal Moderation: Assesses safety across diverse inputs including text, images, image-text combinations, and multilingual content, covering both user queries and model responses.
- Dynamic Reasoning Flow: Features a fast-slow mode for immediate safety signals, followed by deeper reasoning for precise judgments, outputting both a binary
safe/unsafeverdict and a matched risk category. - Strong Benchmark Performance: Achieves state-of-the-art average performance across six major benchmark categories for multimodal, image-only, text query, text response, and multilingual safety.
- Native Inference Compatibility: Integrates seamlessly with standard Transformers and vLLM chat-style message inputs.
Good For
- Developers needing a flexible, adaptable safety layer for multimodal AI applications.
- Moderation systems requiring dynamic policy updates without model retraining.
- Identifying and categorizing risks in user-generated content or AI-generated responses across various modalities and languages.