Publication dataset anonymization removes participant identifiers so that a file meets the GDPR Recital 26 standard for anonymous information. Once no participant can be identified, the data falls outside GDPR scope. anonym.plus runs offline and keeps the remaining variables open and usable.
When this applies
A journal asks for the underlying file as a supplement. It must name no participant, since open release is permanent and public.
How anonym.plus handles it
- Load the file (CSV, XLSX, or DOCX) into anonym.plus on your device.
- The tool scans columns and notes for any direct identifier.
- Local OCR reads a scanned survey sheet if you add one.
- Confirm the flagged names, emails, and rare dates.
- Replace each with a stable token, or drop the field.
- Save the open-ready file locally with no upload.
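The replace-or-drop step above can be pictured with a minimal Python sketch. This is not anonym.plus code or its API: the function names `tokenize_column` and `drop_column`, the sample columns, and the `[P_001]` numbering scheme are all assumptions made for illustration.

```python
import csv
import io


def tokenize_column(rows, column, prefix):
    """Replace each distinct value in `column` with a stable token such as [P_001].

    Hypothetical helper: the same input value always maps to the same token,
    so repeated rows from one participant stay linked to each other.
    """
    token_map = {}  # original value -> token; keep this private, never publish it
    out = []
    for row in rows:
        value = row[column]
        if value not in token_map:
            token_map[value] = f"[{prefix}_{len(token_map) + 1:03d}]"
        out.append({**row, column: token_map[value]})
    return out, token_map


def drop_column(rows, column):
    """Remove a field entirely, in the spirit of a Redact-style operator."""
    return [{k: v for k, v in row.items() if k != column} for row in rows]


# Made-up sample data standing in for a loaded CSV file.
sample = io.StringIO(
    "name,email,score\n"
    "Priya Nair,p.nair@example.org,4\n"
    "Tom Berg,t.berg@example.org,5\n"
    "Priya Nair,p.nair@example.org,3\n"
)
rows = list(csv.DictReader(sample))
rows, token_map = tokenize_column(rows, "name", "P")
rows = drop_column(rows, "email")
for row in rows:
    print(row)
```

In this sketch tokens are numbered by first appearance, so both rows for the same participant receive `[P_001]` while the email column disappears altogether.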
What you need to provide
- The file (CSV, XLSX, DOCX, or scan).
- An operator: Replace for tokens, Redact to drop a column.
- Optional: a token map (do not publish it).
PHI entity types detected
| Category | anonym.plus entity type | Example |
|---|---|---|
| Names | PERSON | Priya Nair → [P_017] |
| Email | EMAIL_ADDRESS | p.nair@example.org → [EMAIL] |
| Dates | DATE_TIME | survey 03/03/2026 → [WEEK] |
| Location | LOCATION | Kochi, India → [REGION] |
| Age | AGE | age 97 → [AGE_BAND] |
| Free-text ID | ID | resp R-204 → [RESP_ID] |
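The date and age rows in the table replace an exact value with a coarser one. A rough Python sketch of that kind of generalisation follows. The helper names `to_iso_week` and `to_age_band`, the day-first date format, the 10-year band width, and the 90+ top band are assumptions for illustration, not documented anonym.plus behaviour.

```python
from datetime import datetime


def to_iso_week(date_text, fmt="%d/%m/%Y"):
    """Coarsen an exact date to its ISO year and week number.

    The day-first format is an assumption; pass another `fmt` if needed.
    """
    iso = datetime.strptime(date_text, fmt).isocalendar()
    return f"{iso[0]}-W{iso[1]:02d}"


def to_age_band(age, width=10, top=90):
    """Map an age to a band; ages at or above `top` share one open-ended band."""
    if age >= top:
        return f"{top}+"
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


print(to_iso_week("03/03/2026"))  # 2026-W10
print(to_age_band(97))            # 90+
print(to_age_band(34))            # 30-39
```

Grouping every very high age into one open-ended band is one way to keep a rare value such as 97 from standing out. The right width and cut-off depend on how many participants fall into each band.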
Compliance achieved
- Aims for GDPR Recital 26 scope, where data is no longer personal.
- Runs offline, so the tool itself needs no data processing agreement (the GDPR counterpart of a HIPAA BAA).
- On-device AES-256-GCM protects the working copy.
- Supports GDPR Art. 9 duties for special-category data while the file is still personal data, before it becomes anonymous.
Anonymize publication datasets offline — see plans & start free →
Limitations & cautions
Open release is final, so the bar is high. Recital 26 requires that no one can be identified by any means reasonably likely to be used, whether by the controller or by another person. The tool removes direct identifiers; you must also generalise rare values and keep the token map out of the release before you publish.
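One way to find values that still need generalising is to count how many rows share each combination of indirect identifiers. The sketch below is a k-anonymity-style check written for illustration. The function name `small_groups`, the column names, and the threshold `k` are assumptions, and passing such a check does not by itself establish that a file meets the Recital 26 bar.

```python
from collections import Counter


def small_groups(rows, quasi_identifiers, k=5):
    """Return quasi-identifier combinations shared by fewer than k rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up rows that have already been banded and regionalised.
rows = [
    {"age_band": "30-39", "region": "South"},
    {"age_band": "30-39", "region": "South"},
    {"age_band": "30-39", "region": "South"},
    {"age_band": "90+", "region": "South"},
]
risky = small_groups(rows, ["age_band", "region"], k=3)
print(risky)  # {('90+', 'South'): 1}
```

Any combination that the check reports is a candidate for a wider band, a larger region, or removal before release.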
Frequently asked questions
What does GDPR Recital 26 say?
It states that truly anonymous data, where no one can be identified, is outside the GDPR. A published supplement should reach that bar, since open release cannot be undone.
Is removing names enough for open data?
Often not. A rare age or a small region can still single someone out. Remove direct identifiers first, then generalise rare values, for example by banding ages or widening regions.
Should I publish the token map?
No. If the map is public, the data can be re-linked and is no longer anonymous. Keep any map private, or discard it entirely.