Batch de-identification removes patient identifiers from many files in a single run. In the UK, health data is special category data under UK GDPR Art. 9 and is also governed by the DPA 2018. anonym.plus runs the whole process on your own device. Each file keeps its meaning, but none names a person.
When this applies
You are preparing a research cohort of hundreds of studies. The findings can stay, but every name, date, and ID across the set has to be hidden first.
How anonym.plus handles it
- Point anonym.plus at a folder on your own machine.
- Local OCR reads any scanned page or stamped frame.
- The tool flags names, dates, and IDs across the set.
- Review the flags once and set your rule.
- Apply that rule to every file in one pass.
- Save the clean folder. The source set stays with you.
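The one-pass idea in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not anonym.plus's own code: the `RULES` patterns, the `apply_rules` and `run_batch` helpers, and the restriction to `.txt` files are all assumptions made for the example. The point it shows is that output goes to a separate folder and the source files are only ever read.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration only; real detection covers far more.
RULES = {
    "[MRN]": re.compile(r"\bMRN[:\s]*\d{6,10}\b"),
    "[DATE]": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def apply_rules(text: str) -> str:
    """Swap every match of each pattern for its placeholder token."""
    for token, pattern in RULES.items():
        text = pattern.sub(token, text)
    return text


def run_batch(source: Path, clean: Path) -> int:
    """Write a de-identified copy of each .txt file under `source` into `clean`.

    The source folder is read but never modified. Returns the file count.
    """
    count = 0
    for path in sorted(source.rglob("*.txt")):
        target = clean / path.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        cleaned = apply_rules(path.read_text(encoding="utf-8"))
        target.write_text(cleaned, encoding="utf-8")
        count += 1
    return count
```

Keeping `clean` outside `source` avoids the run picking up its own output on a second pass.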
What you need to provide
- A folder of files (DICOM, PDF, image, or mixed).
- An operator: Replace (swap the value for a token), Redact (remove the value), or Mask (hide part of the value).
- Optional: a shared name map for the whole cohort.
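To make the difference between the three operators concrete, here is a rough Python sketch. The function names, the `visible` and `fill` parameters, and the choice to keep the trailing characters when masking are assumptions for illustration; they are not taken from anonym.plus.

```python
def replace(value: str, token: str) -> str:
    """Replace: swap the whole value for a placeholder token."""
    return token


def redact(value: str) -> str:
    """Redact: remove the value entirely."""
    return ""


def mask(value: str, visible: int = 3, fill: str = "*") -> str:
    """Mask: hide all but the last `visible` characters.

    Values no longer than `visible` are hidden completely, so a short
    value is never shown in full.
    """
    if visible <= 0 or len(value) <= visible:
        return fill * len(value)
    hidden = len(value) - visible
    return fill * hidden + value[hidden:]
```

For example, `mask("07700900123")` returns `"********123"`, while `replace("07700900123", "[PHONE]")` returns `"[PHONE]"`.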
Patient data entity types detected
| Category | anonym.plus entity type | Example |
|---|---|---|
| Names | PERSON | Various patients → [PATIENT_n] |
| Dates | DATE_TIME | All study dates → [DATE] |
| Record IDs | MEDICAL_RECORD_NUMBER | MRN list → [MRN] |
| Identifiers | ID | Accession list → [ACCESSION] |
| Site | ORGANIZATION | Source NHS trusts → [SITE] |
| Contact | PHONE_NUMBER | Contact lines → [PHONE] |
Compliance achieved
- Anonymisation under UK GDPR Art. 9 & DPA 2018.
- Once truly anonymous, each file leaves UK GDPR scope under Recital 26.
- Assessed against the ICO motivated-intruder test.
- Working files are encrypted with AES-256-GCM.
Limitations & cautions
A batch run applies one rule to many files. Sample the output to confirm the rule fit every layout. Odd templates may need a second pass.
Frequently asked questions
Can one name map cover the whole cohort?
Yes. A shared map gives each patient one stable token across every file. The cohort stays linkable with no real name shown.
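One way to picture a shared map is a small lookup that hands out a numbered token the first time it sees a name and returns the same token afterwards. The `NameMap` class below is a hypothetical sketch under that assumption, including its simple whitespace and case normalisation; it does not describe how anonym.plus stores or matches names.

```python
class NameMap:
    """Assign each distinct patient name one stable token for the whole cohort."""

    def __init__(self, prefix: str = "PATIENT") -> None:
        self._prefix = prefix
        self._tokens: dict[str, str] = {}

    def token_for(self, name: str) -> str:
        # Normalise spacing and case so trivial variants share one token.
        key = " ".join(name.split()).casefold()
        if key not in self._tokens:
            self._tokens[key] = f"[{self._prefix}_{len(self._tokens) + 1}]"
        return self._tokens[key]
```

Reusing one `NameMap` instance across every file is what keeps the cohort linkable: the same patient always maps to the same `[PATIENT_n]` token.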
Does the batch handle mixed file types?
Yes. A folder can hold DICOM files with headers, exported PDF pages, and image frames. Local OCR reads the scans, so each type is covered in one run.
How do I trust a large run?
Sample the output. Open a handful of files and confirm the IDs are gone. A spot check catches any layout the rule missed.