Nemotron 3 Content Safety Model Overview
The Nemotron 3 Content Safety model, developed by NVIDIA, is a 4.3 billion parameter Large Language Model (LLM) classifier built upon Google's Gemma-3-4B-it base. It is specifically fine-tuned for multimodal and multilingual content safety, acting as a moderator for inputs (text and optional images) and responses from both LLMs and VLMs. This model extends the capabilities of previous Nemoguard models by incorporating image analysis and broader language support.
Key Capabilities
- Multimodal Content Safety: Evaluates the safety of both text prompts and associated images.
- Multilingual Support: Supports 12 languages: English, Arabic, German, Spanish, French, Hindi, Japanese, Thai, Dutch, Italian, Korean, and Chinese.
- Input and Response Moderation: Can assess the safety of user prompts (with or without images) and generated AI responses.
- Detailed Safety Categorization: Optionally returns specific safety categories violated (e.g., Violence, Sexual, Criminal Planning) based on a comprehensive taxonomy.
- Commercial Use Ready: Licensed under the NVIDIA Nemotron Open Model License, Gemma Terms of Use, and Gemma Prohibited Use Policy.
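Because the model can optionally return the specific safety categories violated, a downstream service typically needs to parse that verdict before acting on it. The sketch below is a minimal, hypothetical example: it assumes a NemoGuard-style JSON verdict with `"User Safety"` and `"Safety Categories"` keys, which is an assumption for illustration, not the documented schema of this model.

```python
import json

# Hypothetical parser for a moderation verdict. The key names
# ("User Safety", "Safety Categories") are assumed, NemoGuard-style
# conventions; check the actual model card for the real schema.
def parse_safety_verdict(raw: str) -> dict:
    """Parse a JSON safety verdict string from the moderation model.

    Returns a dict with:
      - "safe": True when no violation was flagged
      - "categories": list of violated category names (empty when safe)
    """
    verdict = json.loads(raw)
    unsafe = verdict.get("User Safety", "").strip().lower() == "unsafe"
    # "Safety Categories" is assumed to be a comma-separated string,
    # present only when the input is flagged as unsafe.
    cats = verdict.get("Safety Categories", "")
    categories = [c.strip() for c in cats.split(",") if c.strip()] if unsafe else []
    return {"safe": not unsafe, "categories": categories}

# Example with a mocked model response (not real model output):
mocked = '{"User Safety": "unsafe", "Safety Categories": "Violence, Criminal Planning"}'
print(parse_safety_verdict(mocked))
# {'safe': False, 'categories': ['Violence', 'Criminal Planning']}
```

In a real deployment the `raw` string would come from the model's generated response; the parsing step stays the same regardless of how the model is served.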
Good For
- Moderating LLM/VLM Applications: Ideal for integrating content safety checks into AI systems that handle text and image inputs/outputs.
- Multilingual AI Deployments: Suitable for applications requiring content moderation across diverse language user bases.
- Identifying Specific Harms: Useful for developers who need to not only detect unsafe content but also understand the specific nature of the violation.
- Reducing False Positives: Evaluated on general-purpose benchmarks (MMMU, DocVQA, AI2D) to verify low false-positive rates on safe inputs.