Batch anonymisation strips personal data from a whole folder of letters in one run, supporting compliance with UK GDPR Art. 9 (special category data, which includes health data) and the DPA 2018. anonym.plus runs on your device and uses a shared label map, so each patient maps to the same alias in every file.
When this applies
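Building a teaching corpus or research set often means cleaning hundreds of letters at once, with the same patient or doctor appearing in many of them. Handling letters one by one risks drift: the same person can end up with a different alias in different files. A batch run keeps aliases consistent.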
How anonym.plus handles it
- Point anonym.plus at the folder on your machine.
- It scans each letter for the full set of direct and indirect identifiers.
- A shared map gives each recurring person the same alias across the run.
- Review the summary and fix any low-confidence flags.
- Save the clean corpus on your device.
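The shared-map idea in the steps above can be sketched in a few lines. This is a minimal illustration, not anonym.plus's internal code or API: `SharedLabelMap`, `anonymise_batch`, and the `(label, value)` detection format are hypothetical names, and detection itself is assumed to have already happened.

```python
from collections import defaultdict


class SharedLabelMap:
    """Hypothetical sketch: one alias per (label, real value) for a whole run."""

    def __init__(self):
        self._aliases = {}                 # (label, normalised value) -> alias
        self._counters = defaultdict(int)  # label -> last number issued

    def alias_for(self, label, value):
        # Collapse whitespace and case so "JANE  DOE" and "Jane Doe" share a key.
        key = (label, " ".join(value.split()).casefold())
        if key not in self._aliases:
            self._counters[label] += 1
            self._aliases[key] = f"[{label}_{self._counters[label]}]"
        return self._aliases[key]


def anonymise_batch(letters, detections, label_map):
    """letters: {filename: text}; detections: {filename: [(label, value), ...]}.

    Detections are assumed to come from an earlier detection step.
    """
    cleaned = {}
    for name, text in letters.items():
        # Replace longer values first so a short value cannot clobber part of a longer one.
        for label, value in sorted(detections[name], key=lambda d: -len(d[1])):
            text = text.replace(value, label_map.alias_for(label, value))
        cleaned[name] = text
    return cleaned


letters = {
    "a.txt": "Jane Doe saw Dr Alan Smith.",
    "b.txt": "Dr Alan Smith reviewed JANE DOE.",
}
detections = {
    "a.txt": [("PATIENT", "Jane Doe"), ("PROVIDER", "Dr Alan Smith")],
    "b.txt": [("PROVIDER", "Dr Alan Smith"), ("PATIENT", "JANE DOE")],
}
result = anonymise_batch(letters, detections, SharedLabelMap())
print(result["a.txt"])  # [PATIENT_1] saw [PROVIDER_1].
print(result["b.txt"])  # [PROVIDER_1] reviewed [PATIENT_1].
```

Because the map outlives each file, the second letter reuses the aliases issued for the first. Processing each letter with a fresh map would restart the numbering and break that link.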
What you need to provide
- A folder of letters (PDF, DOCX, or mixed).
- The shared label map turned on, so aliases stay consistent across files.
- An operator (Replace works well for a corpus).
Patient data entity types detected
| Category | anonym.plus entity type | Example |
|---|---|---|
| Names | PERSON | patient across files → [PATIENT_1] |
| Names | PERSON | repeat doctor → [PROVIDER_1] |
| Dates | DATE_TIME | letter dates → [DATE] |
| Record IDs | MEDICAL_RECORD_NUMBER | MRNs → [MRN_n] |
| Location | LOCATION | addresses → [ADDRESS] |
| Contact | EMAIL_ADDRESS | emails → [EMAIL] |
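As a rough illustration of how the table's placeholders could be produced, the sketch below maps each entity type to a placeholder stem. The stems and the numbered/unnumbered split are read off the examples in the table; the `PLACEHOLDERS` dictionary, the `placeholder` function, and the `role` argument for PERSON are assumptions made for this example, not anonym.plus settings.

```python
# Stems copied from the table above; True means the example carries a number.
# PERSON needs a role (patient or provider), assumed to be supplied by the
# detector or by a reviewer.
PLACEHOLDERS = {
    ("PERSON", "patient"): ("PATIENT", True),
    ("PERSON", "provider"): ("PROVIDER", True),
    ("DATE_TIME", None): ("DATE", False),
    ("MEDICAL_RECORD_NUMBER", None): ("MRN", True),
    ("LOCATION", None): ("ADDRESS", False),
    ("EMAIL_ADDRESS", None): ("EMAIL", False),
}


def placeholder(entity_type, index=None, role=None):
    """Return the replacement token for one detected entity."""
    stem, numbered = PLACEHOLDERS[(entity_type, role)]
    if not numbered:
        return f"[{stem}]"
    if index is None:
        raise ValueError(f"{entity_type} placeholders need an index")
    return f"[{stem}_{index}]"


print(placeholder("PERSON", 1, role="patient"))   # [PATIENT_1]
print(placeholder("MEDICAL_RECORD_NUMBER", 3))    # [MRN_3]
print(placeholder("DATE_TIME"))                   # [DATE]
```

Numbered placeholders keep distinct people and records distinguishable in the cleaned corpus, while unnumbered ones such as `[DATE]` drop the value entirely.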
Compliance achieved
- Applies the same UK GDPR Art. 9 and DPA 2018 handling to every file.
- The shared map keeps aliases consistent across the run.
- Whole-folder work stays offline; nothing is uploaded.
Limitations & cautions
Mixed folders (some scanned, some native) rely on OCR for the image files, so review low-confidence flags from scans with care. The shared map keeps aliases consistent, but it also links each alias back to a real identity, so anyone holding it can re-link the set. Store it securely, or delete it once it is no longer needed.
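One way to guard an exported map, assuming it is kept as a JSON file, is to restrict it to the current user and delete it when re-linking is no longer required. The sketch below uses only the Python standard library; `save_map_private`, `discard_map`, and the file layout are hypothetical and not part of anonym.plus.

```python
import json
import os


def save_map_private(label_map, path):
    """Write the alias map so that only the current user can read it.

    The 0o600 mode applies on POSIX systems; Windows ignores most of these
    bits, so rely on folder permissions or disk encryption there.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(label_map, fh, indent=2)
    # os.open only sets the mode on creation, so tighten a pre-existing file too.
    os.chmod(path, 0o600)


def discard_map(path):
    """Delete the map once re-linking is no longer needed.

    os.remove unlinks the file; it does not overwrite the underlying disk blocks.
    """
    if os.path.exists(path):
        os.remove(path)
```

A typical flow would call `save_map_private(mapping, "label_map.json")` right after the run and `discard_map("label_map.json")` when the corpus is released.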
Frequently asked questions
How does batch mode keep one patient steady?
A shared map records each person once, so the same patient or doctor maps to the same alias in every file in the folder.
Can the batch mix PDFs, DOCX, and scans?
Yes. Mixed types work. Scanned letters are read with local OCR before detection runs.
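To make the routing concrete, here is a small sketch of how a mixed folder might be sorted into native and OCR work. It is an illustration under assumptions: `plan_folder`, the image suffix list, and the `pdf_has_text_layer` callback (which decides whether a PDF is scanned) are hypothetical and do not describe how anonym.plus makes this decision.

```python
from pathlib import Path

# Assumed image formats for scanned letters; adjust to what the folder contains.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def plan_folder(folder, pdf_has_text_layer):
    """Map each file name to "native", "ocr", or "skip".

    pdf_has_text_layer is a caller-supplied function (Path -> bool), because
    telling a scanned PDF from a native one needs a PDF parser.
    """
    plan = {}
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".docx":
            plan[path.name] = "native"
        elif suffix == ".pdf":
            plan[path.name] = "native" if pdf_has_text_layer(path) else "ocr"
        elif suffix in IMAGE_SUFFIXES:
            plan[path.name] = "ocr"
        else:
            plan[path.name] = "skip"
    return plan
```

Files routed to "ocr" are the ones whose low-confidence flags deserve a closer look during review.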
Is there a cloud quota on size?
No. Work is local, so speed depends on your hardware, not a cloud plan.