Academic Publication Dataset Anonymization with anonym.plus

Prepare a supplementary file for open release that names no participant.

Publication dataset anonymization is the removal of participant identifiers so that a file meets the standard for anonymous information described in GDPR Recital 26. Once no participant can be identified, the data falls outside GDPR scope. anonym.plus runs offline and keeps the variables usable for open release.

When this applies

A journal asks for the underlying file as a supplement. It must name no participant, since open release is permanent and public.

How anonym.plus handles it

  1. Load the file (CSV, XLSX, or DOCX) into anonym.plus on your device.
  2. The tool scans columns and notes for any direct identifier.
  3. Local OCR reads a scanned survey sheet if you add one.
  4. Confirm the flagged names, emails, and rare dates.
  5. Replace each with a stable token, or drop the field.
  6. Save the open-ready file locally with no upload.
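The scan, confirm, and replace steps above can be sketched in a few lines of Python. This is an illustration only, not anonym.plus's own code or API: the `EMAIL_RE` pattern, the `tokenize_emails` helper, and the `[EMAIL_001]` token format are all assumptions made for the example, and a real detector covers many more identifier types than email.

```python
import re

# Hypothetical pattern for illustration; real detectors are far broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def tokenize_emails(rows):
    """Replace each distinct email with a stable token such as [EMAIL_001]."""
    token_map = {}  # email -> token; keep private or discard before release

    def swap(match):
        email = match.group(0).lower()  # same address, same token
        if email not in token_map:
            token_map[email] = f"[EMAIL_{len(token_map) + 1:03d}]"
        return token_map[email]

    cleaned = [
        {col: EMAIL_RE.sub(swap, val) for col, val in row.items()}
        for row in rows
    ]
    return cleaned, token_map

rows = [
    {"id": "1", "note": "Contact p.nair@example.org for follow-up"},
    {"id": "2", "note": "Same respondent: P.Nair@example.org"},
    {"id": "3", "note": "Reach a.khan@example.org"},
]
cleaned, token_map = tokenize_emails(rows)
```

Because the token is stable, the two notes about the same respondent stay linked in the cleaned rows while the address itself is gone.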

What you need to provide

Identifier entity types detected

Category      anonym.plus entity type   Example
Names         PERSON                    Priya Nair → [P_017]
Email         EMAIL_ADDRESS             p.nair@example.org → [EMAIL]
Dates         DATE_TIME                 survey 03/03/2026 → [WEEK]
Location      LOCATION                  Kochi, India → [REGION]
Age           AGE                       age 97 → [AGE_BAND]
Free-text ID  ID                        resp R-204 → [RESP_ID]
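The date and age examples in the table are generalisations rather than plain deletions. A minimal sketch of what such a mapping could look like follows; the function names, the ISO-week output format, the 10-year band width, and the "90+" top band are assumptions chosen for illustration, not the tool's documented behaviour.

```python
from datetime import datetime

def to_week(date_text, fmt="%d/%m/%Y"):
    """Generalise a date to its ISO year and week, e.g. '2026-W10'."""
    iso = datetime.strptime(date_text, fmt).date().isocalendar()
    return f"{iso[0]}-W{iso[1]:02d}"

def to_age_band(age, width=10, top=90):
    """Bucket an age into a band, collapsing rare high ages into 'top+'."""
    if age >= top:
        return f"{top}+"
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

week = to_week("03/03/2026")  # '2026-W10'
band = to_age_band(97)        # '90+'
```

Collapsing the oldest ages into one open-ended band matters because a value like 97 is rare enough to single out a participant on its own.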

Compliance achieved

Limitations & cautions

Open release is final, so the bar is high. Under Recital 26, data counts as anonymous only if no one can be identified by any means reasonably likely to be used, including singling out. The tool removes direct identifiers; you must also generalise rare values and discard the token map, or keep it private, before you publish.
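One way to find the rare values that still need generalising is to count how many rows share each combination of indirect identifiers. The sketch below assumes hypothetical column names (`age_band`, `region`) and a threshold of `k=5`; that threshold is an illustrative choice, not a figure taken from Recital 26 or from anonym.plus.

```python
from collections import Counter

def rare_combinations(rows, quasi_identifiers, k=5):
    """Return quasi-identifier combinations shared by fewer than k rows."""
    counts = Counter(
        tuple(row[col] for col in quasi_identifiers) for row in rows
    )
    return {combo: n for combo, n in counts.items() if n < k}

# Hypothetical data: six similar respondents and one outlier.
rows = (
    [{"age_band": "30-39", "region": "South"}] * 6
    + [{"age_band": "90+", "region": "South"}]
)
flagged = rare_combinations(rows, ["age_band", "region"], k=5)
# flagged == {("90+", "South"): 1}
```

Any flagged combination is a candidate for a wider band, a larger region, or removal before the file is published.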

Frequently asked questions

What does GDPR Recital 26 say?

It states that truly anonymous data, where no one can be identified, is outside the GDPR. A published supplement should reach that bar, since open release cannot be undone.

Is removing names enough for open data?

Often not. A rare age or a small region can still single someone out. Remove direct identifiers first, then generalise rare values.

Should I publish the token map?

No. If the map is public, the data can be re-linked and is no longer anonymous. Keep any map private, or discard it entirely.
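If no map is needed after release, one option is to never create one. The sketch below derives stable tokens from a random key that exists only in memory for a single run; it is an assumption about how this could be done, not a description of how anonym.plus generates tokens, and `make_tokenizer` is a hypothetical helper.

```python
import hashlib
import hmac
import secrets

def make_tokenizer(prefix="P"):
    """Return a function mapping a value to a stable token for this run only."""
    key = secrets.token_bytes(32)  # held in memory, never written to disk

    def tokenize(value):
        digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        # Truncated for readability; use more characters for large datasets.
        return f"[{prefix}_{digest[:8]}]"

    return tokenize

tokenize = make_tokenizer()
a = tokenize("Priya Nair")
b = tokenize("Priya Nair")  # same value, same token within this run
c = tokenize("A. Khan")     # different value, different token
```

Once the process ends the key is gone, so the tokens cannot be regenerated from the names. The trade-off is that a later run produces different tokens, so all files that must stay linked need to be processed together.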