nvidia/Privasis-Cleaner-4B

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 8, 2026License:otherArchitecture:Transformer0.0K Cold

Privasis-Cleaner-4B is a 4 billion parameter decoder-only Transformer model developed by NVIDIA, built upon Qwen3 4B Instruct. It is specifically fine-tuned for text sanitization, capable of removing or abstracting sensitive information based on user-provided instructions. This model excels at preprocessing text for privacy-preserving research, content sanitization, and compliance pipelines by generating cleaned versions of raw text.

Loading preview...

Overview

Privasis-Cleaner-4B is a 4 billion parameter text-sanitization model developed by NVIDIA, based on the Qwen3 4B Instruct architecture. Its core function is to remove or abstract sensitive information from text according to user-defined sanitization instructions. The model was fine-tuned on 37,000 instruction–input–output triplets, enabling it to produce compliant, cleaned text.

Key Capabilities

  • Instruction-driven Sanitization: Users provide specific instructions (e.g., "Remove all person names, exact dates, and exact locations") to guide the sanitization process.
  • Privacy-Preserving: Designed for automatic redaction of PII/PHI, making it suitable for sensitive data handling.
  • Lightweight: At 4 billion parameters, it offers a balance between performance and computational efficiency.
  • Synthetic Data Training: Trained and tested on synthetic text-based triplets, ensuring no personal data was used in its development.

Use Cases

  • Data Preprocessing: Ideal for preparing datasets for privacy-preserving research.
  • Content Moderation: Sanitizing content to meet compliance standards (e.g., GDPR, HIPAA).
  • Automated Redaction: Automatically removing sensitive entities from text streams or documents.

This model is intended for research and non-commercial use, with deployment supported globally. Further details on its underlying research can be found in the Privasis paper.