PolyGuard-Qwen-Smol is a multilingual safety moderation model developed by Priyanshu Kumar et al. for safeguarding Large Language Model (LLM) generations across 17 languages. It is trained on PolyGuardMix, described by its authors as the largest multilingual safety training corpus to date (1.91M samples), and performs three classification tasks: prompt harmfulness, response harmfulness, and response refusal. The authors report that it outperforms existing state-of-the-art open-weight and commercial safety classifiers by 5.5%, which makes it well suited to robust multilingual content moderation.