Learn

Step-by-step guides for document anonymization with anonym.plus.

Document anonymization is the process of detecting and replacing personally identifiable information (PII) in documents, images, and structured data files before those files are shared, archived, or submitted for legal or regulatory purposes. Done correctly, anonymization protects individuals' privacy and helps organizations meet their obligations under GDPR, HIPAA, and similar frameworks.

These guides cover every anonymization workflow supported by anonym.plus — from pasting a paragraph of text to processing hundreds of files in batch. Each guide is written for practitioners: lawyers handling discovery documents, HR teams exporting personnel records, healthcare administrators preparing datasets for research, and developers building GDPR-compliant pipelines.

Two anonymization approaches

anonym.plus offers two fundamentally different ways to handle detected PII, and choosing between them shapes your entire workflow:

Replace (irreversible): Substitutes PII with labelled placeholders like [PERSON] or [EMAIL_ADDRESS]. The original value is gone. Use this for final redaction, such as court filings, public datasets, and vendor submissions.
Encrypt (reversible): Replaces PII with AES-256-GCM encrypted tokens using a key stored in your local vault. The original value can be recovered at any time by the key holder. Use this for internal sharing, staging environments, or workflows where you need to re-identify later.
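
To make the Replace behavior concrete, here is a minimal Python sketch of placeholder substitution. It is not how anonym.plus works internally: the regular expressions, the PATTERNS table, and the replace_pii function are hypothetical stand-ins for the tool's own detection engine, used only to show that the output keeps a label and nothing of the original value.

```python
import re

# Hypothetical stand-in patterns for illustration only; a real detection
# engine uses far more robust recognizers than these two regexes.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE_NUMBER": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def replace_pii(text: str) -> str:
    """Swap every match for a labelled placeholder; the original is discarded."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Contact jane.doe@example.com or call +1 555 010 9999."
redacted = replace_pii(sample)
print(redacted)  # Contact [EMAIL_ADDRESS] or call [PHONE_NUMBER].
```

Because nothing links a placeholder back to its source value, the redacted text cannot be re-identified. Encrypt differs precisely here: each token carries ciphertext that the key holder can decrypt.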

What the guides cover

The six guides below move from simple to advanced. Start with the text anonymization guide if you are new to the tool — it covers the Replace vs Encrypt decision in detail. Move to the file guide once you are comfortable with text workflows. The GDPR compliance guide is the most referenced guide for legal and compliance teams and covers the regulatory basis for each anonymization decision.

All guides assume you are running anonym.plus desktop with the local Presidio engine active. No cloud account or internet connection is required for any workflow described here — all processing happens on your machine.

How to Anonymize Text: Replace vs Encrypt
Compare the two text anonymization methods — Replace for irreversible placeholders and Encrypt for reversible AES-256-GCM protection. Learn when to use each approach.
How to Anonymize PDF, DOCX, and XLSX Files
Format-specific guidance for document anonymization. Learn what works for each format, how document structure is preserved, and which size limits apply.
How to Anonymize Images with OCR-Based Detection
Understand the OCR pipeline for image anonymization: Tesseract text extraction, NER-based PII detection, and redaction of sensitive regions.
Batch Processing: Anonymize Multiple Documents at Once
Learn when to use batch vs single-file processing, how to choose between Replace and Encrypt for bulk operations, and workflow best practices.
Encryption Keys & Reversible Anonymization
Complete guide to encryption key lifecycle: generation, use during anonymization, key rotation, and recovery via BIP39 mnemonic phrases.
GDPR-Compliant Document Anonymization
Compliance checklist: what GDPR requires for document anonymization and how anonym.plus satisfies each requirement with offline, zero-knowledge processing.