saillab/x-guard
Text Generation | Concurrency Cost: 1 | Model Size: 3.1B | Quant: BF16 | Ctx Length: 32k | Published: Apr 3, 2025 | License: MIT | Architecture: Transformer | Open Weights

saillab/x-guard is a 3.1 billion parameter multilingual safety agent developed by saillab for transparent content moderation across diverse linguistic contexts. It is designed to defend against both conventional low-resource language attacks and more sophisticated code-switching attacks. The system pairs a custom-finetuned mBART-50 translation module with the X-Guard 3B evaluation model, which was trained through supervised finetuning and GRPO, to detect unsafe content in 132 languages. Its primary strength is providing robust, transparent safety evaluations for LLMs and integrated systems.
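
The two-stage design described above (translate first, then evaluate) can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `moderate`, `SafetyVerdict`, and the two stub functions are illustrative stand-ins, not part of the saillab/x-guard release. The sketch also assumes that the translation step produces English text and that the evaluator returns a label plus a short rationale; the description above does not confirm either detail. In a real deployment the stubs would be replaced by calls to the mBART-50 translation module and the X-Guard 3B evaluation model.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class SafetyVerdict:
    """Result of one moderation pass (hypothetical structure)."""
    original_text: str
    translated_text: str
    label: str       # "safe" or "unsafe"
    rationale: str   # short explanation, kept for transparency


def moderate(
    text: str,
    translate: Callable[[str], str],
    evaluate: Callable[[str], Tuple[str, str]],
) -> SafetyVerdict:
    """Run the translate-then-evaluate pipeline on a single input."""
    # Stage 1: normalise the input into one pivot language
    # (assumed to be English in this sketch).
    english = translate(text)
    # Stage 2: judge the translated text and keep the rationale.
    label, rationale = evaluate(english)
    return SafetyVerdict(text, english, label, rationale)


def stub_translate(text: str) -> str:
    """Toy stand-in for the translation module: word-level glossary lookup."""
    glossary = {"hola": "hello", "mundo": "world"}
    return " ".join(glossary.get(word.lower(), word) for word in text.split())


def stub_evaluate(english: str) -> Tuple[str, str]:
    """Toy stand-in for the evaluation model: keyword match only."""
    blocked = {"bomb"}
    hits = sorted(word for word in english.lower().split() if word in blocked)
    if hits:
        return "unsafe", f"matched blocked terms: {hits}"
    return "safe", "no blocked terms matched"


if __name__ == "__main__":
    # Code-switched input: Spanish greeting mixed with English.
    print(moderate("hola bomb", stub_translate, stub_evaluate))
```

Passing the translator and evaluator in as callables keeps the pipeline logic independent of any particular model-loading code, so the stubs can be swapped for real model calls without changing `moderate`.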
