Name: rishiskhare/gemma-3-promptshield API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: rishiskhare

Model Overview

rishiskhare/gemma-3-promptshield is a specialized 0.3 billion parameter language model, fine-tuned from unsloth/gemma-3-270m-it using the Unsloth framework. Its core function is to identify and classify prompt injection attacks within user inputs, distinguishing between malicious and safe prompts.

Key Capabilities

Prompt Injection Detection: Classifies inputs as '1' (injection detected) or '0' (safe).
Security Filtering: Designed to improve the robustness and safety of large language model applications.
Red Teaming: Useful for analyzing and identifying potential prompt injection vulnerabilities in LLM systems.

Performance Metrics

Evaluated on the hendzh/PromptShield test set (2,940 samples), the model demonstrates strong performance in identifying prompt injections:

ROC AUC: 0.9652
Accuracy: 89.89%
F1 Score: 0.7990

Intended Use Cases

This model is ideal for developers and security researchers looking to integrate a dedicated prompt injection detection layer into their LLM-powered applications or for conducting security assessments.

Overview

Model Overview

Key Capabilities

Performance Metrics

Intended Use Cases

Full Model Card (README)