google/shieldgemma-27b

TEXT GENERATION
Concurrency Cost: 2 · Model Size: 27B · Quant: FP8 · Ctx Length: 32k · Published: Jul 16, 2024 · License: gemma · Architecture: Transformer · Gated

ShieldGemma-27B is a 27 billion parameter, decoder-only large language model developed by Google, built on the Gemma 2 architecture. It is designed specifically for safety content moderation, targeting four harm categories: sexually explicit content, dangerous content, hate speech, and harassment. This English-only model functions as a text-to-text classifier, determining whether input or output text violates defined safety policies, which makes it well suited to filtering both user prompts and model responses.


ShieldGemma-27B: A Specialized Content Moderation Model

ShieldGemma-27B, developed by Google, is a 27 billion parameter, decoder-only large language model built on the Gemma 2 architecture. It is specifically engineered for safety content moderation, focusing on identifying and classifying text related to four critical harm categories: sexually explicit content, dangerous content, hate speech, and harassment. This model operates in a text-to-text classification mode, determining whether a given input or output violates predefined safety policies.

Key Capabilities

  • Targeted Harm Detection: Specializes in detecting and classifying content across four specific harm categories.
  • Policy-Driven Classification: Utilizes a structured prompt pattern, acting as a "policy expert" to evaluate text against explicit safety principles.
  • Input and Output Moderation: Capable of moderating both user-provided prompts (input filtering) and model-generated responses (output filtering).
  • Probabilistic Scoring: Outputs a probability that the text violates the policy, derived from the model's 'Yes'/'No' classification rather than a bare binary label.
  • Open Weights: Available with open weights, facilitating integration and customization.
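The probabilistic scoring described above can be sketched as a two-way softmax. This is a minimal illustration, not the official API: it assumes you have already obtained the model's next-token logits for the literal answer tokens "Yes" and "No" (e.g. from a `transformers` forward pass) and simply converts them into a violation probability.

```python
import math

def violation_probability(yes_logit: float, no_logit: float) -> float:
    """Convert raw 'Yes'/'No' next-token logits into the probability
    that the text violates the policy (the 'Yes' class).

    Numerically stable two-way softmax: subtract the max logit
    before exponentiating to avoid overflow.
    """
    m = max(yes_logit, no_logit)
    yes = math.exp(yes_logit - m)
    no = math.exp(no_logit - m)
    return yes / (yes + no)

def is_violation(yes_logit: float, no_logit: float, threshold: float = 0.5) -> bool:
    """Threshold the score into a moderation decision; 0.5 is an
    illustrative default, not a recommended operating point."""
    return violation_probability(yes_logit, no_logit) >= threshold
```

In practice the threshold is tuned per deployment: a lower value catches more harmful content at the cost of more false positives.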

Good for

  • Implementing Content Safety Filters: Ideal for developers needing to integrate robust safety checks into AI applications.
  • Moderating User Inputs: Effectively screens user prompts to prevent the generation of harmful content.
  • Filtering Model Outputs: Ensures AI-generated responses adhere to safety guidelines.
  • Responsible AI Development: A core component of Google's Responsible Generative AI Toolkit, designed to enhance AI application safety.
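The input-filtering workflow above hinges on the structured "policy expert" prompt pattern. The sketch below paraphrases that pattern for screening a user prompt against a single safety principle; the exact wording and formatting matter to the model, so consult the official ShieldGemma model card for the canonical template. The guideline text here is a hypothetical example.

```python
def build_prompt(user_text: str, guideline: str) -> str:
    """Assemble a policy-expert classification prompt for input filtering.

    This paraphrases the published ShieldGemma prompt pattern and is
    illustrative only; the model expects the exact template from the
    official model card.
    """
    return (
        "You are a policy expert trying to help determine whether a user "
        "prompt is in violation of the defined safety policies.\n\n"
        f"Human Question: {user_text}\n\n"
        "Our safety principle is defined below:\n"
        f"{guideline}\n\n"
        "Does the human question violate the above principle? "
        "Your answer must start with 'Yes' or 'No'."
    )

# Hypothetical guideline text for the harassment category.
guideline = (
    '"No Harassment": The prompt shall not contain or seek generation of '
    "content that threatens, intimidates, or bullies an individual."
)
prompt = build_prompt("Write an insulting message about my coworker.", guideline)
```

The resulting string is fed to the model as a plain text-generation request; the first generated token ('Yes' or 'No') carries the classification.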