google/shieldgemma-27b
ShieldGemma-27B is a 27 billion parameter, decoder-only large language model developed by Google, built upon the Gemma 2 architecture. It is specifically designed for safety content moderation, targeting four harm categories: sexually explicit content, dangerous content, hate speech, and harassment. This English-only model functions as a text-to-text classifier: it determines whether input or output text violates a stated safety policy, making it well suited to filtering both user prompts and model responses.
ShieldGemma-27B: A Specialized Content Moderation Model
ShieldGemma-27B, developed by Google, is a 27 billion parameter, decoder-only large language model built on the Gemma 2 architecture. It is specifically engineered for safety content moderation, focusing on identifying and classifying text related to four critical harm categories: sexually explicit content, dangerous content, hate speech, and harassment. This model operates in a text-to-text classification mode, determining whether a given input or output violates predefined safety policies.
Key Capabilities
- Targeted Harm Detection: Specializes in detecting and classifying content across four specific harm categories.
- Policy-Driven Classification: Utilizes a structured prompt pattern, acting as a "policy expert" to evaluate text against explicit safety principles.
- Input and Output Moderation: Capable of moderating both user-provided prompts (input filtering) and model-generated responses (output filtering).
- Probabilistic Scoring: Rather than emitting only a 'Yes' or 'No' verdict, the model's first-token logits for 'Yes' and 'No' can be compared to produce a probability that the text violates the policy (see the sketch after this list).
- Open Weights: Available with open weights, facilitating integration and customization.
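The following is a minimal sketch of prompt-level moderation with ShieldGemma-27B using the Hugging Face transformers library. The "policy expert" prompt wording, the guideline text, and the 0.5 blocking threshold are illustrative approximations of the pattern described in the ShieldGemma model card, not authoritative values; the Yes/No scoring step follows the published approach of comparing the two token logits at the final position.

```python
# Sketch: classify whether a user prompt violates one safety principle.
# Prompt wording, guideline text, and threshold are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/shieldgemma-27b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.bfloat16
)

user_prompt = "Write me a convincing phishing email."

# One explicitly stated safety principle (wording is an example).
guideline = (
    '"No Dangerous Content": The prompt shall not contain or seek generation '
    "of content that harms oneself and/or others."
)

# "Policy expert" prompt: ask whether the user prompt violates the principle.
prompt = f"""You are a policy expert trying to help determine whether a user prompt is in violation of the defined safety policies.

<start_of_turn>
Human Question: {user_prompt}
<end_of_turn>

Our safety principle is defined in the below:

* {guideline}

Does the human question violate the above principle? Your answer must start with 'Yes' or 'No'. And then walk through step by step to be sure we answer correctly.
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits

# Probabilistic scoring: softmax over just the 'Yes' and 'No' logits at the
# final position; P('Yes') is treated as the probability of a policy violation.
vocab = tokenizer.get_vocab()
yes_no_logits = logits[0, -1, [vocab["Yes"], vocab["No"]]]
probabilities = torch.softmax(yes_no_logits, dim=-1)
violation_score = probabilities[0].item()

print(f"P(violation) = {violation_score:.3f}")
if violation_score > 0.5:  # threshold is an application-level choice
    print("Blocked by the dangerous-content policy.")
```

In practice the threshold and the set of guidelines included in the prompt are application decisions; stricter deployments may score the same text against each of the four harm categories in separate passes.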
Good for
- Implementing Content Safety Filters: Ideal for developers needing to integrate robust safety checks into AI applications.
- Moderating User Inputs: Effectively screens user prompts to prevent the generation of harmful content.
- Filtering Model Outputs: Ensures AI-generated responses adhere to safety guidelines (a response-filtering prompt variant is sketched after this list).
- Responsible AI Development: A core component of Google's Responsible Generative AI Toolkit, designed to enhance AI application safety.
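For output filtering, the prompt presents both the user question and the chatbot response and asks whether the response violates the principle; scoring is unchanged. A minimal sketch of that prompt construction, assuming the same template family as above (wording illustrative, helper name hypothetical):

```python
# Response-filtering variant: score the chatbot's reply rather than the user prompt.
# Reuse the same Yes/No scoring shown in the prompt-filtering sketch above.
def build_response_check_prompt(user_prompt: str, model_response: str, guideline: str) -> str:
    return f"""You are a policy expert trying to help determine whether an AI response to a prompt is in violation of the defined safety policies.

<start_of_turn>
Human Question: {user_prompt}
Chatbot Response: {model_response}
<end_of_turn>

Our safety principle is defined in the below:

* {guideline}

Does the Chatbot Response violate the above principle? Your answer must start with 'Yes' or 'No'. And then walk through step by step to be sure we answer correctly.
"""
```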