Distil-PII-Llama-3.2-3B-Instruct Overview
Distil-PII-Llama-3.2-3B-Instruct is a 3-billion-parameter small language model (SLM) developed by Distil Labs, fine-tuned from meta-llama/Llama-3.2-3B-Instruct. The model is optimized for policy-aware PII redaction and designed to run efficiently in local environments. It processes plain-text input and consistently outputs a single JSON object containing the `redacted_text` and a detailed list of the entities that were replaced.
Key Capabilities
- Precise PII Redaction: Identifies and redacts various PII types including names, emails, phone numbers, addresses, SSNs, national IDs, UUIDs, credit card/IBAN last-4s, gender, age, race, and marital status.
- Schema Adherence: Guarantees output in a strict JSON format, including `redacted_text` and an array of `entities` with `value`, `replacement_token`, and `reason` fields.
- Operational Signal Preservation: Designed to remove identity information while retaining crucial operational data like order numbers, ticket IDs, and last-4 digits of financial identifiers.
- High Accuracy: Achieves an evaluation score of **0.82**.
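To illustrate the schema adherence described above, the following sketch parses and sanity-checks a model response in Python. The sample text, entity values, and replacement tokens are illustrative; only the field names (`redacted_text`, `entities`, `value`, `replacement_token`, `reason`) come from the schema described here.

```python
import json

# Illustrative model output (values are hypothetical); the schema matches
# the fields documented above: redacted_text plus an entities array.
raw_output = """
{
  "redacted_text": "Hi, this is [NAME_1]. Order 48213 hasn't arrived. Reach me at [EMAIL_1].",
  "entities": [
    {"value": "Jane Doe", "replacement_token": "[NAME_1]", "reason": "person name"},
    {"value": "jane.doe@example.com", "replacement_token": "[EMAIL_1]", "reason": "email address"}
  ]
}
"""

result = json.loads(raw_output)
assert "redacted_text" in result

for entity in result["entities"]:
    # Every entity carries the original value, its token, and the redaction reason.
    assert {"value", "replacement_token", "reason"} <= entity.keys()
    # Each replacement token should appear in the redacted text.
    assert entity["replacement_token"] in result["redacted_text"]
```

Note that the order number (48213) survives in `redacted_text`, consistent with the operational-signal preservation described above.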
Good For
- Support Chat & Log Redaction: Ideal for anonymizing customer support interactions, system logs, and incident tickets.
- Transcript Processing: Suitable for redacting sensitive information from call transcripts and other spoken-word data.
- Local Deployment: Optimized for running on-premises using frameworks like vLLM or Ollama, ensuring data privacy and control.
- Structured Data Output: Provides a reliable JSON output, simplifying integration into automated data processing pipelines.
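A minimal integration sketch for local deployment, assuming an Ollama server on its default port and a hypothetical local model tag `distil-pii-llama-3.2-3b` (both the URL and the tag are assumptions, not part of this model card):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed Ollama default endpoint
MODEL_NAME = "distil-pii-llama-3.2-3b"              # hypothetical local model tag

def parse_redaction(model_output: str) -> dict:
    """Parse the model's single-JSON-object output, raising on schema violations."""
    result = json.loads(model_output)
    if "redacted_text" not in result or "entities" not in result:
        raise ValueError("model output missing required fields")
    return result

def redact(text: str) -> dict:
    """Send plain text to the locally served model and return the parsed JSON."""
    payload = json.dumps({"model": MODEL_NAME, "prompt": text, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Ollama's non-streaming response wraps the model's text in a "response" field.
    return parse_redaction(body["response"])
```

Because the model guarantees a single JSON object, `parse_redaction` can sit directly in an automated pipeline without free-text post-processing.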