Overview
Meta-SecAlign-70B is a 70 billion parameter LoRA adapter developed by facebook, specifically fine-tuned from Llama-3.3-70B-Instruct to provide robust defense against prompt injection attacks. Prompt injection is a critical security threat for LLM-integrated applications, and Meta-SecAlign aims to provide an open-source solution for secure LLM deployment in agentic and security-sensitive contexts. A smaller 8B version, Meta-SecAlign-8B, is also available for resource-constrained environments.
Key Capabilities
- Prompt Injection Robustness: Achieves state-of-the-art defense against prompt injection, significantly reducing attack success rates (ASR) across benchmarks like AlpacaFarm (0.5% ASR vs. 95.7% for base Llama-3.3-70B-Instruct) and InjecAgent (0.5% ASR vs. 53.8%).
- Maintained Utility: While focused on security, it largely preserves the general utility of its base model, Llama-3.3-70B-Instruct, across benchmarks like MMLU and BBH, with only minor performance deltas.
- Agentic Workflow Security: Demonstrates improved success rates in agentic workflows under attack conditions (e.g., AgentDojo success rate of 79.5% with attack vs. 43.4% for base model).
- Commercial Usage: The model is ready for commercial usage, subject to its specific license terms.
When to Use
Meta-SecAlign-70B is ideal for developers building LLM-integrated applications, especially those interacting with untrusted external data or operating in security-sensitive environments where prompt injection is a concern. It is particularly well-suited for agentic workflows requiring high resilience against malicious manipulation. Users should enclose untrusted data within the new "input" role in conversations to leverage its defensive capabilities.