eternisai/Anonymizer-4B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Aug 27, 2025License:cc-by-nc-4.0Architecture:Transformer0.0K Open Weights Warm

The eternisai/Anonymizer-4B is a 4 billion parameter language model based on Qwen3, specifically fine-tuned for high-accuracy PII anonymization. It excels at semantically similar replacement of personal data, achieving a 9.55/10 anonymization quality score. This model is designed for critical anonymization tasks in enterprise and research, offering performance comparable to much larger models.

Loading preview...

Model Overview

The eternisai/Anonymizer-4B is a 4 billion parameter language model, part of the Enchanted anonymizer series, developed by eternisai. Built upon the Qwen3-4B architecture, this model is specifically designed for high-accuracy anonymization of Personally Identifiable Information (PII).

Key Capabilities

  • High-Accuracy PII Replacement: The model identifies and replaces PII with semantically equivalent alternatives, preserving context while enhancing privacy. It achieves a 9.55/10 score on anonymization quality.
  • Efficient Performance: Despite its strong performance, it offers low latency, with Time To First Token (TTFT) under 250ms and full completion under 2 seconds when quantized.
  • Structured Output: It generates structured JSON outputs via tool calls, detailing original PII and its anonymized replacements.

Training Details

Anonymizer-4B was trained using Supervised Fine-Tuning (SFT) followed by GRPO (Generative Reinforcement Learning from PPO) with GPT-4.1 acting as the judge. The training dataset comprised approximately 30,000 samples covering various PII replacement and non-replacement scenarios.

Intended Use Cases

  • Primary: Integrated as a high-accuracy anonymizer within the Enchanted platform.
  • Secondary: Suitable for enterprise and research deployments where top-tier anonymization quality is critical.

Important Usage Notes

  • Chat Template Required: The model necessitates the use of tokenizer.apply_chat_template() with a specific tool schema; raw prompts are not supported.
  • Special Marker: User queries must include the /no_think marker for proper PII detection.

Limitations

As the largest model in its series, Anonymizer-4B requires MacBook-class hardware or above for real-time inference and is not optimized for mobile devices as of August 2025.