Name: hlyn-labs/prompt-injection-judge-8b API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: hlyn-labs

Overview

hlyn-labs/prompt-injection-judge-8b is an 8-billion-parameter model developed by hlyn-labs and engineered as a security judge that identifies and helps mitigate prompt injection attacks against LLMs. It is fine-tuned from Hermes-3-Llama-3.1-8B using ORPO (Odds Ratio Preference Optimization) and DoRA (Weight-Decomposed Low-Rank Adaptation).

Key Capabilities

Prompt Injection Detection: Purpose-built to detect LLM prompt injection attacks and flag them for blocking.
System-2 Reasoning: Uses a deliberative execution path that requires an internal chain of thought inside <think> tags before a verdict is finalized, which improves accuracy on complex edge cases.
Deterministic JSON Output: Emits a structured JSON verdict with three fields: decision ("ALLOW" or "BLOCK"), confidence (a float from 0.0 to 1.0), and reason.
Production-Grade: Built for integration into production security pipelines.
Optimized Formats: Available as defender-8b-Q8_0.gguf (8.5 GB) for Apple Silicon and local inference, and as model-0000X-of-00004.safetensors shards (16 GB) for enterprise cloud deployments and vLLM.
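
The verdict format described above can be consumed with a small parser. The following is a minimal sketch, not the authors' reference client. It assumes the raw completion contains an optional <think>...</think> block followed by a single JSON object whose keys are literally "decision", "confidence", and "reason". The Verdict class, the parse_verdict helper, and the sample completion are hypothetical names and data introduced only for illustration.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Verdict:
    decision: str      # "ALLOW" or "BLOCK"
    confidence: float  # expected to lie in [0.0, 1.0]
    reason: str


# Matches the internal reasoning block; DOTALL lets it span several lines.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def parse_verdict(raw: str) -> Verdict:
    """Parse a raw completion into a Verdict (hypothetical helper).

    Assumes the layout '<think>...</think>' followed by one JSON object.
    Raises ValueError when the output does not match that assumption.
    """
    # Drop the chain-of-thought so braces inside it cannot confuse parsing.
    visible = _THINK_RE.sub("", raw)

    start = visible.find("{")
    end = visible.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")

    data = json.loads(visible[start:end + 1])

    decision = str(data["decision"]).upper()
    if decision not in ("ALLOW", "BLOCK"):
        raise ValueError(f"unexpected decision: {decision!r}")

    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")

    return Verdict(decision, confidence, str(data.get("reason", "")))


# Hypothetical completion, written by hand to match the assumed layout.
sample = (
    "<think>The user text asks the assistant to ignore its system prompt."
    "</think>\n"
    '{"decision": "BLOCK", "confidence": 0.97, '
    '"reason": "Instruction override attempt"}'
)

verdict = parse_verdict(sample)
print(verdict.decision, verdict.confidence)  # prints: BLOCK 0.97
```

In a security pipeline it is usually safer to treat a parsing failure as a BLOCK rather than an ALLOW, so that malformed output cannot let a request through by accident.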

Good For

Developers and organizations needing to secure their LLM applications against prompt injection.
Implementing a robust, automated security layer for LLM interactions.
Use cases requiring a highly calibrated and deterministic judgment on prompt safety.
