What personal data should I remove before sending a document to an AI?

Remove: full names, email addresses, phone numbers, postal addresses, national IDs (passport, SSN, Aadhaar, etc.), IBANs and financial identifiers, dates of birth, medical diagnoses and record numbers, IP addresses, and any other information that could identify a specific person. anonym.plus detects 340+ entity types automatically.

Does anonym.plus upload my documents anywhere during processing?

No. anonym.plus runs the entire detection and anonymization pipeline locally on your machine. The Microsoft Presidio NLP engine and spaCy models are bundled in the installer. No document content is transmitted to any server. Processing works fully offline.

How to Anonymize Documents Before Sharing with ChatGPT or Claude

Q: Is it safe to upload documents to ChatGPT?

Not without removing personal data first. ChatGPT, Claude, and other cloud AI services are operated by US companies and, depending on your privacy settings, may use uploaded content to improve their models. Under GDPR, uploading personal data to a US cloud service is a restricted data transfer that requires appropriate safeguards. Always anonymize documents before uploading.

Q: Can I use encrypted placeholders so I can see the AI's answer in context?

Yes. Use the Encrypt operator in anonym.plus instead of Replace. It replaces each PII entity with a unique encrypted token (e.g. «ENC:a3f9...»). You send the encrypted document to the AI. When you receive the response, paste it into the Deanonymize tab in anonym.plus, which automatically replaces the encrypted tokens with the original values.

By George Curta · Published March 17, 2026 · 6 min read

Every time you upload a contract, medical record, HR file, or business report to ChatGPT or Claude, you are transferring personal data to a US-based cloud service. Under GDPR, this may constitute an unlawful data transfer. The fix is simple: anonymize the document first, locally, before any AI ever sees it.

Why Uploading PII to AI Tools is a GDPR Risk

ChatGPT (OpenAI), Claude (Anthropic), and Gemini (Google) are US-based cloud services. When you upload a document containing personal data — names, email addresses, medical records, contracts with client information — you are:

Transferring personal data to a third country (USA) without GDPR-compliant safeguards in most cases
Acting as a data controller who bears full responsibility for that transfer under GDPR Art. 44–49
Potentially contributing to AI training — some services use uploaded content to improve models, depending on privacy settings
Potentially breaching professional secrecy obligations: lawyers, doctors, HR professionals, and accountants face additional duties beyond GDPR

The solution is not to stop using AI tools — it is to strip all personal data before the AI ever sees the document. Anonymized content is not personal data under GDPR Recital 26, so it can be freely shared with cloud services.

What Personal Data to Remove

Before sending a document to an AI assistant, remove all of the following:

Category	Examples	anonym.plus Entity Type
Names	John Smith, Dr. María García	PERSON
Contact details	john@company.com, +49 30 1234567	EMAIL_ADDRESS, PHONE_NUMBER
Addresses	123 High Street, Berlin 10115	LOCATION, STREET_ADDRESS
National IDs	SSN, ITIN, passport numbers, Aadhaar, NHS	US_SSN, US_ITIN, PASSPORT, IN_AADHAAR, UK_NHS
Financial identifiers	IBAN DE89 3704 0044, Visa 4111...	IBAN_CODE, CREDIT_CARD
Health data	Diagnosis codes, medical record numbers	MEDICAL_LICENSE
Dates (context-dependent)	Date of birth, admission date	DATE_TIME
IP addresses	192.168.1.45, 2001:db8::1	IP_ADDRESS

anonym.plus detects all of these automatically as part of its 340+ supported entity types. You review the results before any data is modified.
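To make the idea of typed entity detection concrete, here is a minimal sketch using only Python's standard library. It is not how anonym.plus or Presidio work internally: the three regular expressions and the `detect` helper are hypothetical simplifications, and real detectors combine NLP models, checksums, and context rather than bare patterns.

```python
import re

# Hypothetical, simplified patterns for three entity types from the table.
# They will miss edge cases and produce false positives on real documents.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),  # IPv4 only
    "IBAN_CODE": re.compile(
        r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b"
    ),
}


def detect(text):
    """Return (entity_type, start, end, value) tuples sorted by position."""
    hits = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((entity_type, match.start(), match.end(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])


sample = "Contact john@company.com from 192.168.1.45 about the invoice."
for hit in detect(sample):
    print(hit)
```

Names and free-form addresses cannot be caught this way, which is why tools in this space rely on a named-entity model for types such as PERSON and LOCATION.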

Replace vs Encrypt: Which Mode to Use

anonym.plus offers two anonymization approaches for AI workflows:

Replace (one-way anonymization)

Each PII entity is replaced with a generic label: [PERSON_1], [EMAIL_1], [IBAN_1]. The AI sees a clean document with placeholders. The original values are permanently removed from the output file, so the result counts as anonymized under GDPR Recital 26 as long as the remaining text does not allow re-identification.

Best for: Legal research, compliance drafting, summarization, translation, content review — any task where the AI does not need to know actual personal values.
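The Replace behaviour can be sketched in a few lines. This is an illustration under assumptions, not the anonym.plus implementation: the `replace_entities` helper and the email pattern are hypothetical, and giving repeated values the same numbered label is an assumed design choice that keeps the text readable for the AI.

```python
import re

# Hypothetical email pattern used only for this illustration.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def replace_entities(text, pattern, label):
    """Swap each distinct match for a numbered placeholder such as [EMAIL_1]."""
    seen = {}  # original value -> placeholder, so repeats share one label

    def substitute(match):
        value = match.group()
        if value not in seen:
            seen[value] = f"[{label}_{len(seen) + 1}]"
        return seen[value]

    # The mapping lives only inside this call and is discarded afterwards,
    # which is what makes the replacement one-way.
    return pattern.sub(substitute, text)


text = "Write to john@company.com or anna@company.com; john@company.com replies fastest."
print(replace_entities(text, EMAIL, "EMAIL"))
# Write to [EMAIL_1] or [EMAIL_2]; [EMAIL_1] replies fastest.
```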

Encrypt (reversible pseudonymization)

Each PII entity is replaced with a unique AES-256-GCM ciphertext token. The document looks anonymized to the AI but can be fully restored later using your stored encryption key. This enables the encrypt-share-edit-decrypt workflow: you send the encrypted document to an AI or colleague, receive their edited version back, then deanonymize in one click.

Best for: Contract review, document editing, medical coding, any task where the AI's output must reference the original personal values.
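The round trip behind the Encrypt mode can be sketched as follows. One important substitution: Python's standard library has no AES-GCM, so this sketch swaps the AES-256-GCM ciphertext for a random token plus a local lookup table (the `vault` dictionary). The token format follows the «ENC:a3f9...» example, but the fixed eight-hex-digit length, the helper names, and the email pattern are all assumptions made for illustration.

```python
import re
import secrets

# Hypothetical patterns: one entity type and the assumed token shape.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
TOKEN = re.compile(r"«ENC:[0-9a-f]{8}»")


def pseudonymize(text, pattern, vault):
    """Replace matches with opaque tokens and record token -> value in vault."""
    by_value = {value: token for token, value in vault.items()}

    def substitute(match):
        value = match.group()
        if value not in by_value:
            token = f"«ENC:{secrets.token_hex(4)}»"
            by_value[value] = token
            vault[token] = value  # stays on the local machine
        return by_value[value]

    return pattern.sub(substitute, text)


def deanonymize(text, vault):
    """Restore original values; tokens not found in the vault are left as-is."""
    return TOKEN.sub(lambda m: vault.get(m.group(), m.group()), text)


vault = {}
masked = pseudonymize("Send the draft to john@company.com.", EMAIL, vault)
# Stand-in for an AI edit that keeps the token intact.
ai_reply = masked.replace("Send the draft to", "Draft sent to")
print(deanonymize(ai_reply, vault))
```

Whether the restore step relies on a stored key or a stored table, the output remains pseudonymized rather than anonymized while that secret exists, because the original values can still be recovered.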

Step-by-Step Workflow

Open anonym.plus and go to the Files tab (for documents) or Text tab (for pasted text).
Drop your file — PDF, DOCX, XLSX, TXT, CSV, or image. All processing is local; nothing is uploaded.
Select a detection preset — use the GDPR Compliance preset (confidence 0.90) or configure entity types manually.
Choose Replace or Encrypt depending on whether you need the AI's response to reference specific values.
Review detected entities — correct false positives and confirm all PII is flagged.
Click Anonymize and export the clean output file.
Upload the anonymized file to the AI — Claude, ChatGPT, Gemini, Copilot, or any other tool.
(If Encrypt) Deanonymize the AI response: paste the AI's response into the Deanonymize tab in anonym.plus. The app auto-matches the encrypted tokens and restores the original values.
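The preset and review steps above both hinge on a confidence threshold. As a rough sketch of what a 0.90 cut-off means in practice, the code below splits detections into those accepted automatically and those left for manual review. The `Detection` class, the scores, and the idea that lower-scoring hits are queued for review are assumptions for illustration; the source does not describe how anonym.plus applies the threshold internally.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    entity_type: str
    text: str
    score: float  # detector confidence between 0.0 and 1.0


def split_by_confidence(detections, threshold=0.90):
    """Separate detections at or above the threshold from those below it."""
    accepted = [d for d in detections if d.score >= threshold]
    needs_review = [d for d in detections if d.score < threshold]
    return accepted, needs_review


# Made-up scores for illustration only.
detections = [
    Detection("EMAIL_ADDRESS", "john@company.com", 0.99),
    Detection("PERSON", "Jordan", 0.55),  # ambiguous: person or country
]

accepted, needs_review = split_by_confidence(detections)
print("accepted:", [d.text for d in accepted])
print("review:  ", [d.text for d in needs_review])
```

A higher threshold reduces false positives but pushes more borderline entities into the manual review step, so the review pass matters most for strict presets.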


The Legal Basis: Why Anonymized Data is Safe to Share

GDPR Recital 26 states: "The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person."

Properly anonymized documents — where re-identification is effectively impossible — fall entirely outside GDPR's scope. You can share them with any service in any country without data transfer restrictions, without consent, and without a Data Processing Agreement.

For Replace-mode outputs, anonym.plus produces true anonymization. For Encrypt-mode outputs, the output is pseudonymized (GDPR still applies, but the AI service receives pseudonymized rather than clearly personal data). Always use Replace for external AI services where you do not need the values restored.

Frequently Asked Questions

Is it safe to upload documents to ChatGPT?

Not without removing personal data first. ChatGPT is operated by a US entity and uploading EU personal data without appropriate safeguards may violate GDPR Art. 44. Always anonymize with Replace mode before uploading to any cloud AI service.

Can I use encrypted placeholders so I can see the AI's answer in context?

Yes. Use the Encrypt operator. It replaces PII with ciphertext tokens that the AI treats as opaque placeholders, and you can deanonymize the response in one click to see its answers in the context of the original names and values.

Does anonym.plus upload documents anywhere?

No. The entire detection and anonymization pipeline runs on your device. The Presidio NLP engine and spaCy models are bundled locally. No document content is ever sent to any server.

What document formats does anonym.plus support for AI pre-processing?

PDF (50 MB), DOCX (30 MB), XLSX (20 MB), TXT (50 MB), CSV, JSON, XML (30 MB), and images (PNG, JPG, BMP, TIFF — 10 MB). Structure is preserved in all output formats.

Limitations to Know

OCR accuracy on images: Text extraction from scanned documents and images is highly accurate for printed text (95%+). Handwritten content, low-resolution scans, or unusual fonts may have lower recall — manual review is recommended for image-heavy documents.
Context-dependent detection: Ambiguous tokens (e.g., "Jordan" as a country vs. a person name) may be missed or over-flagged depending on surrounding context. Review confidence scores and add custom regex patterns for domain-specific identifiers not covered by the default model.
Not a substitute for legal advice: Anonymization reduces re-identification risk and brings documents outside GDPR scope under Recital 26 — but it does not eliminate all legal risk for highly sensitive or regulated datasets. Consult your DPO for high-stakes use cases.
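The custom regex patterns mentioned above can be combined with nearby context words to reduce both misses and over-flagging. The sketch below is an illustration under assumptions: the `MR-` record-number format, the context words, the score values, and the `find_record_numbers` helper are all hypothetical and do not reflect the anonym.plus configuration format.

```python
import re

# Hypothetical in-house identifier: "MR-" followed by seven digits.
RECORD_NUMBER = re.compile(r"\bMR-\d{7}\b")
CONTEXT_WORDS = ("patient", "record", "chart")


def find_record_numbers(text, window=40):
    """Return (value, score) pairs; nearby context words raise the score."""
    lowered = text.lower()
    results = []
    for match in RECORD_NUMBER.finditer(text):
        # Look at a fixed window of characters on both sides of the match.
        nearby = lowered[max(0, match.start() - window): match.end() + window]
        has_context = any(word in nearby for word in CONTEXT_WORDS)
        results.append((match.group(), 0.95 if has_context else 0.60))
    return results


print(find_record_numbers("Patient chart MR-0048213 was updated."))
print(find_record_numbers("Part MR-0048213 shipped on Monday."))
```

The same string scores differently in the two sentences, which is the behaviour you want for identifiers that share a format with unrelated codes such as part numbers.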
