distil-labs/Distil-PII-Llama-3.2-3B-Instruct

Public · 3.2B params · BF16 · 32768 context · Oct 13, 2025 · License: llama3.2
Overview

Distil-PII-Llama-3.2-3B-Instruct is a 3.2-billion-parameter small language model (SLM) from Distil Labs, fine-tuned from meta-llama/Llama-3.2-3B-Instruct. It is optimized for policy-aware PII redaction and designed to run efficiently in local environments. Given plain-text input, it consistently outputs a single JSON object containing the redacted_text and a list of the entities that were replaced.

Key Capabilities

  • Precise PII Redaction: Identifies and redacts various PII types including names, emails, phone numbers, addresses, SSNs, national IDs, UUIDs, credit card/IBAN last-4s, gender, age, race, and marital status.
  • Schema Adherence: Guarantees output in a strict JSON format, including redacted_text and an array of entities with value, replacement_token, and reason fields.
  • Operational Signal Preservation: Designed to remove identity information while retaining crucial operational data like order numbers, ticket IDs, and last-4 digits of financial identifiers.
  • High Accuracy: Achieves an evaluation score of 0.82.
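
Given the schema above, a response for a short support message might look like the sketch below. The entity values, the bracketed replacement-token format, and the reason strings are illustrative assumptions, not guaranteed model output:

```python
import json

# Illustrative model output for an input like:
# "Hi, I'm Jane Doe (jane@example.com), order #58213 is late."
response = """
{
  "redacted_text": "Hi, I'm [NAME] ([EMAIL]), order #58213 is late.",
  "entities": [
    {"value": "Jane Doe", "replacement_token": "[NAME]", "reason": "person name"},
    {"value": "jane@example.com", "replacement_token": "[EMAIL]", "reason": "email address"}
  ]
}
"""

parsed = json.loads(response)

# Top-level schema: exactly redacted_text plus an entities array
assert set(parsed) == {"redacted_text", "entities"}
# Operational signal (the order number) survives redaction
assert "#58213" in parsed["redacted_text"]
```

Validating each response this way (keys present, JSON parses, identity values absent from redacted_text) is a cheap guard before passing results downstream.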

Good For

  • Support Chat & Log Redaction: Ideal for anonymizing customer support interactions, system logs, and incident tickets.
  • Transcript Processing: Suitable for redacting sensitive information from call transcripts and other spoken-word data.
  • Local Deployment: Optimized for running on-premises using frameworks like vLLM or Ollama, ensuring data privacy and control.
  • Structured Data Output: Provides a reliable JSON output, simplifying integration into automated data processing pipelines.
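
As a sketch of pipeline integration, the helper below builds a JSON-mode chat request for a local Ollama server. The model tag is a hypothetical local name, and any prompt or policy wrapper your deployment needs is not shown:

```python
import json

OLLAMA_URL = "http://localhost:11434/api/chat"  # default Ollama chat endpoint

def build_redaction_request(text: str) -> str:
    """Build the JSON body for a non-streaming, JSON-mode chat call."""
    payload = {
        "model": "distil-pii-llama3.2:3b",  # hypothetical local model tag
        "messages": [{"role": "user", "content": text}],
        "format": "json",   # ask Ollama to constrain output to valid JSON
        "stream": False,    # return one complete response object
    }
    return json.dumps(payload)

body = build_redaction_request("My SSN is 123-45-6789.")
```

The response's message content can then be parsed with json.loads and checked against the redacted_text/entities schema before use.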