GDPR and Personal Data
The General Data Protection Regulation (GDPR) defines personal data broadly under Article 4. Any information that can directly or indirectly identify a natural person qualifies as personal data: names, email addresses, phone numbers, location data, online identifiers (IP addresses, cookies), financial data (IBANs, credit card numbers), national identification numbers, and much more.
Organizations processing personal data must comply with the principles set out in Article 5: lawfulness, fairness, and transparency; purpose limitation; data minimization; accuracy; storage limitation; integrity and confidentiality; and accountability. The most serious infringements can result in fines of up to 20 million euros or 4% of total worldwide annual turnover, whichever is higher.
anonym.plus detects over 340 entity types covering all categories of GDPR-relevant PII. The detection engine uses Microsoft Presidio combined with spaCy NER models, supplemented by pattern-based recognizers for structured data like IBANs, credit card numbers, and national IDs across dozens of countries.
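To illustrate what a pattern-based recognizer for structured data can look like, here is a minimal sketch for IBANs. It is not the anonym.plus or Presidio implementation: the names `IBAN_CANDIDATE`, `iban_checksum_ok`, and `find_ibans` are hypothetical, and the regular expression is deliberately loose. The idea is to find candidates with a pattern, then keep only those that pass the standard mod-97 checksum.

```python
import re

# Loose candidate pattern (an assumption for this sketch): two-letter country
# code, two check digits, then 11 to 30 alphanumerics, optionally space-separated.
IBAN_CANDIDATE = re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]){11,30}\b")


def iban_checksum_ok(candidate: str) -> bool:
    """Validate an IBAN candidate with the mod-97 check."""
    iban = candidate.replace(" ", "").upper()
    # Move the country code and check digits to the end.
    rearranged = iban[4:] + iban[:4]
    # Letters map to two-digit numbers: A=10, B=11, ..., Z=35.
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def find_ibans(text: str) -> list[tuple[int, int, str]]:
    """Return (start, end, matched_text) for checksum-valid candidates."""
    return [
        (m.start(), m.end(), m.group())
        for m in IBAN_CANDIDATE.finditer(text)
        if iban_checksum_ok(m.group())
    ]


sample = "Please pay to DE89 3704 0044 0532 0130 00 by friday."
for start, end, value in find_ibans(sample):
    print(start, end, value)
```

The checksum step is what keeps the false-positive rate low: a random string that merely looks like an IBAN rarely passes it. A production recognizer would also check the country-specific length, which this sketch omits.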
Anonymization vs Pseudonymization
This is the most critical distinction in GDPR data protection. The regulation treats anonymized and pseudonymized data fundamentally differently:
| Aspect | Anonymization (Replace) | Pseudonymization (Encrypt) |
|---|---|---|
| GDPR Article | Recital 26 | Article 4(5) |
| Definition | Data no longer relates to an identifiable person | Personal data processed so it cannot be attributed without additional info |
| GDPR scope | Data exits GDPR scope entirely | Data remains in GDPR scope |
| Reversibility | Irreversible — PII permanently removed | Reversible with encryption key |
| anonym.plus operator | Replace, Redact, Mask, Hash | Encrypt |
| Best for | Public release, permanent redaction | Internal use, temporary redaction, audit trail |
The practical implication: if you use the Replace operator and the remaining content no longer allows a person to be identified by any means reasonably likely to be used, the output document is anonymous in the sense of Recital 26 and is no longer subject to GDPR obligations. If you use the Encrypt operator, the output document is pseudonymized: it still falls under GDPR, but pseudonymization and encryption are safeguards explicitly named in Articles 25 and 32.
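The irreversible operators in the table can be sketched in a few lines. This is an illustrative model under stated assumptions, not the product's code: `Span`, `anonymize`, and the `*_op` functions are hypothetical names, and the spans are assumed to be non-overlapping character offsets produced by a detector.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Span:
    start: int
    end: int
    entity_type: str


def replace_op(value: str, span: Span) -> str:
    return f"<{span.entity_type}>"  # placeholder label instead of the value


def redact_op(value: str, span: Span) -> str:
    return ""  # remove the value entirely


def mask_op(value: str, span: Span) -> str:
    return "*" * len(value)  # keep only the length


def hash_op(value: str, span: Span) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def anonymize(text: str, spans: list[Span],
              operator: Callable[[str, Span], str]) -> str:
    """Apply one operator to every span (spans must not overlap)."""
    out = text
    # Work from the end of the text so earlier offsets stay valid.
    for span in sorted(spans, key=lambda s: s.start, reverse=True):
        value = text[span.start:span.end]
        out = out[:span.start] + operator(value, span) + out[span.end:]
    return out


text = "Contact Jane Doe at jane@example.com."
name, email = "Jane Doe", "jane@example.com"
spans = [
    Span(text.index(name), text.index(name) + len(name), "PERSON"),
    Span(text.index(email), text.index(email) + len(email), "EMAIL_ADDRESS"),
]
print(anonymize(text, spans, replace_op))
print(anonymize(text, spans, mask_op))
```

One design point worth noting: an unsalted hash of a low-entropy value such as a phone number can be recovered by trying every possible input, so whether hashed output is truly anonymous depends on the data being hashed.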
How anonym.plus Satisfies GDPR Requirements
anonym.plus addresses multiple GDPR requirements through its architecture and features:
- Data minimization (Art. 5(1)(c)): Detect and remove unnecessary PII before sharing documents. Only the data strictly needed for the purpose is retained.
- Purpose limitation (Art. 5(1)(b)): Anonymize data that is no longer needed for its original collection purpose. The Replace operator permanently removes PII that has served its purpose.
- Data protection by design (Art. 25): Built-in PII detection and anonymization as a core feature, not an afterthought. The entire pipeline is designed for privacy protection from the ground up.
- Right to erasure (Art. 17): The Replace operator permanently removes PII from documents, which helps fulfil erasure requests at the document level. Copies of the original document held elsewhere still need to be handled separately.
- Security of processing (Art. 32): AES-256-GCM encryption for the Encrypt operator, a zero-knowledge architecture, and fully offline processing are technical measures of the kind Article 32 calls for.
- Data protection impact assessment (Art. 35): Processing logs provide a local audit trail documenting what was anonymized, when, and with which settings — supporting DPIA documentation requirements.
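As a rough idea of what a local audit-trail entry could contain, the sketch below records what was anonymized, when, and with which settings, without storing any PII values. The field names and the `build_audit_entry` helper are assumptions made for illustration; the actual log format of anonym.plus is not described here.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def build_audit_entry(document_name: str, entity_types: list[str],
                      operator: str, threshold: float) -> dict:
    """Summarize one processing run; only counts are kept, never the values."""
    return {
        "document": document_name,
        "processed_at": datetime.now(timezone.utc).isoformat(),
        "operator": operator,
        "confidence_threshold": threshold,
        "entity_counts": dict(Counter(entity_types)),
    }


entry = build_audit_entry(
    "contract.pdf",
    ["PERSON", "PERSON", "IBAN_CODE"],
    operator="replace",
    threshold=0.90,
)
print(json.dumps(entry, indent=2))
```

Storing counts per entity type rather than the detected strings keeps the log itself from becoming a second copy of the personal data.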
Offline Processing — No Data Leaves Your Device
The entire PII detection and anonymization pipeline runs locally on your machine. This is a fundamental architectural decision that directly supports GDPR compliance:
- Presidio + spaCy are bundled as a sidecar process that runs entirely on your device. No API calls are made for PII detection or anonymization.
- Documents never leave your device. Text extraction, entity detection, anonymization, and export all happen locally. No document content is ever uploaded to any server.
- The anonym.plus server handles only: account management, licensing, and subscription billing. These functions involve only your email and license key — never your documents.
- The server cannot access: your documents, PII detection results, anonymized outputs, encryption keys, vault contents, or vault password. This is enforced by architecture, not by policy.
This offline-first architecture eliminates an entire category of GDPR risk: data transfer. Since personal data never leaves your device during processing, there are no cross-border data transfer concerns, no processor agreements needed for the anonymization step, and no risk of server-side data breaches affecting your documents.
Zero-Knowledge Architecture
anonym.plus uses a zero-knowledge architecture that ensures even the service provider cannot access your sensitive data:
- Password hashing: Your password is hashed client-side using Argon2id + HKDF + SHA-256 before any server communication. The server stores only cryptographic proofs — it never receives or stores your plaintext password.
- Server-side storage: The server stores only the cryptographic hash of your password, your email address, and license information. No document data, PII results, or encryption keys are ever transmitted to the server.
- Vault encryption: The encryption key vault is encrypted with AES-256-GCM. The vault encryption key is derived from your password via Argon2id. Without your password, the vault contents are cryptographically inaccessible.
- Memory security: Key material is zeroed from memory when the vault locks. No residual key data remains in RAM after lock or application close.
- Key isolation: Encryption keys are referenced by ID only in the frontend. Key values never leave the Rust backend process, preventing exposure through browser dev tools or JavaScript vulnerabilities.
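The HKDF step mentioned above can be sketched with the Python standard library. This is only an illustration of HKDF with HMAC-SHA-256 as defined in RFC 5869. How anonym.plus actually combines Argon2id, HKDF, and SHA-256 is not specified here, so the master secret below is a stand-in for an Argon2id output, and the salt and the `auth`/`vault` labels are hypothetical.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA-256: extract, then expand."""
    if length > 255 * 32:
        raise ValueError("requested output is too long for HKDF-SHA-256")
    # Extract: condense the input key material into a pseudorandom key.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Stand-in for a password-derived secret (assumed to come from Argon2id).
master = bytes(range(32))
salt = b"example-salt"  # hypothetical value for this sketch

auth_key = hkdf_sha256(master, salt, b"auth")    # would be sent as a login proof
vault_key = hkdf_sha256(master, salt, b"vault")  # would stay on the device
print(auth_key != vault_key)
```

Deriving separate keys with different `info` labels means a value sent to a server for authentication cannot be used to compute the key that protects local data, which is the property a zero-knowledge design relies on.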
Step-by-Step GDPR Compliance Workflow
Follow this workflow to anonymize documents in a GDPR-compliant manner:
- Identify documents containing personal data. Gather the documents that need anonymization — contracts, reports, correspondence, databases, or any files containing PII.
- Select the GDPR Compliance detection preset. This preset uses a confidence threshold of 0.90 (higher than the default 0.85) to reduce false positives. Because a higher threshold can also drop borderline detections, the review step below matters all the more. Alternatively, create a custom preset targeting the specific entity types relevant to your data.
- For permanent anonymization: use the Replace operator. This irreversibly removes all detected PII. The output document exits GDPR scope entirely — it is no longer personal data under Recital 26.
- For temporary pseudonymization: use the Encrypt operator. This replaces PII with encrypted tokens that can be reversed with the encryption key. The output document remains in GDPR scope but gains the additional protections recognized by Articles 25 and 32.
- Review all detected entities before processing. The review step is critical for GDPR compliance. Verify that all personal data has been detected (no false negatives) and that non-PII has not been incorrectly flagged (no false positives that would unnecessarily alter the document).
- Save anonymized documents. Export the processed documents. The processing entry is saved to your local history for audit purposes, documenting which entities were detected, which operator was used, and when processing occurred.
- For encrypted documents: manage encryption keys securely. Store keys in the vault with a strong password. Record the 24-word BIP39 recovery phrase in a secure offline location. Track which key was used for which documents to ensure future decryption capability.
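The effect of the confidence threshold in the preset step can be shown with a small sketch. The `Detection` type, the `split_by_threshold` helper, and the scores are made up for illustration and do not come from anonym.plus.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    entity_type: str
    text: str
    score: float


def split_by_threshold(detections: list[Detection], threshold: float):
    """Separate detections that meet the threshold from those that do not."""
    accepted = [d for d in detections if d.score >= threshold]
    below = [d for d in detections if d.score < threshold]
    return accepted, below


detections = [
    Detection("PERSON", "Jane Doe", 0.95),
    Detection("LOCATION", "Springfield", 0.87),
    Detection("PERSON", "Mark", 0.60),
]

for threshold in (0.85, 0.90):
    accepted, below = split_by_threshold(detections, threshold)
    print(threshold, [d.text for d in accepted], [d.text for d in below])
```

With these sample scores, raising the threshold from 0.85 to 0.90 moves "Springfield" out of the accepted set. That is the trade-off behind the review step: a stricter threshold flags less non-PII, but borderline entities need a human check.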