Real-World Evidence Dataset Anonymisation with anonym.plus

Clean an EHR-derived file on your own machine before any analysis or sharing.

Real-world evidence anonymisation is the removal of patient identifiers from data drawn from routine care. It supports the research safeguards in UK GDPR Art. 89. anonym.plus runs offline and keeps the clinical signals usable.

When this applies

A team pulls a cohort from electronic records to study outcomes. The extract still carries names, full birth dates, and clinic codes.

How anonym.plus handles it

  1. Load the extract (CSV, XLSX, or DOCX) into anonym.plus.
  2. The tool scans structured fields and free-text notes.
  3. Local OCR reads any scanned chart page you attach.
  4. Confirm the flagged names, dates, and clinic identifiers.
  5. Replace each with a consistent pseudonym across the file.
  6. Save the cleaned cohort locally with no upload.
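
The consistent replacement in step 5 can be sketched in a few lines of Python. This is a minimal illustration, not anonym.plus's own code: the function `pseudonymise_rows`, the column names, and the sample rows (including the second patient name) are assumptions invented for the example. The idea is that the same original value always maps to the same token, so records for one patient stay linked after cleaning.

```python
import csv
import io

def pseudonymise_rows(rows, column, prefix):
    """Replace each distinct value in `column` with a stable token.

    Hypothetical helper for illustration; the token format follows the
    [PATIENT_5] style shown in this article.
    """
    mapping = {}
    cleaned = []
    for row in rows:
        value = row[column]
        if value not in mapping:
            # First time we see this value: assign the next token number.
            mapping[value] = f"[{prefix}_{len(mapping) + 1}]"
        cleaned.append({**row, column: mapping[value]})
    return cleaned, mapping

# Invented sample extract; a real file would be read from local disk.
SAMPLE = (
    "patient_name,diagnosis\n"
    "Hannah Watkins,Type 2 diabetes\n"
    "John Example,Hypertension\n"
    "Hannah Watkins,Retinopathy\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
cleaned, mapping = pseudonymise_rows(rows, "patient_name", "PATIENT")

for row in cleaned:
    print(row["patient_name"], "|", row["diagnosis"])
```

Both rows for the same patient receive the same token, while the diagnosis column is left untouched. The mapping itself is sensitive and would need to stay on the local machine or be discarded.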

What you need to provide

Patient data entity types detected

Category | anonym.plus entity type | Example
Names | PERSON | Hannah Watkins → [PATIENT_5]
Birth date | DATE_TIME | born 19/02/1947 → [BIRTH_YEAR]
Clinic | ORGANIZATION | Holborn GP Practice → [PROVIDER]
Location | LOCATION | London EC1 → [REGION]
NHS number | MEDICAL_RECORD_NUMBER | NHS 485 777 3310 → [NHS_NO]
Contact | PHONE_NUMBER | +44 20 7946 0151 → [PHONE]
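
To make the replacements in the table concrete, here is a rough sketch of token substitution in a free-text note. It assumes two simple regular expressions (a `dd/mm/yyyy` date and a 3-3-4 digit group) and a hypothetical `redact` function. It is not how anonym.plus detects entities, and two patterns would miss most real-world variants; it only shows the shape of the before and after.

```python
import re

# Hypothetical patterns for illustration only. Real detection has to cope
# with many more formats than these two.
PATTERNS = [
    (re.compile(r"\b\d{3} \d{3} \d{4}\b"), "[NHS_NO]"),      # e.g. 485 777 3310
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "[BIRTH_YEAR]"),  # e.g. 19/02/1947
]

def redact(text):
    """Replace every match of each pattern with its token."""
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text

note = "Patient born 19/02/1947, NHS 485 777 3310, reviewed for hypertension."
redacted = redact(note)
print(redacted)
# Patient born [BIRTH_YEAR], NHS [NHS_NO], reviewed for hypertension.
```

The clinical content ("reviewed for hypertension") passes through unchanged, and only the identifier spans are swapped.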

Compliance achieved


Limitations & cautions

Routine-care extracts are rich, so quasi-identifiers stack up fast. The tool removes direct identifiers and flags rare birth dates. A rare diagnosis combined with a small region can still re-identify someone, so test the combinations before you share.
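
One common way to test combinations is a k-anonymity style count: group the cleaned rows by their quasi-identifiers and list every combination shared by fewer than k people. The sketch below assumes a hypothetical `small_groups` function, invented sample rows, and a threshold of k = 2 chosen only for the example. The right threshold and the right set of columns are a judgement for your own project.

```python
from collections import Counter

def small_groups(rows, quasi_identifiers, k):
    """Return each quasi-identifier combination that occurs fewer than k times."""
    counts = Counter(
        tuple(row[field] for field in quasi_identifiers) for row in rows
    )
    return {combo: n for combo, n in counts.items() if n < k}

# Invented rows standing in for an already pseudonymised cohort.
rows = [
    {"birth_year": "1947", "region": "London EC1", "diagnosis": "Type 2 diabetes"},
    {"birth_year": "1947", "region": "London EC1", "diagnosis": "Type 2 diabetes"},
    {"birth_year": "1947", "region": "London EC1", "diagnosis": "Fabry disease"},
    {"birth_year": "1952", "region": "London EC1", "diagnosis": "Hypertension"},
    {"birth_year": "1952", "region": "London EC1", "diagnosis": "Hypertension"},
]

risky = small_groups(rows, ["birth_year", "region", "diagnosis"], k=2)
for combo, n in risky.items():
    print(combo, "appears", n, "time(s)")
```

Here the single row with a rare diagnosis is the only combination below the threshold, so it is the one to generalise or suppress before the file leaves your machine.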

Frequently asked questions

What is real-world evidence?

It is evidence about care and outcomes drawn from routine sources such as electronic records or claims, rather than from a controlled trial. Such extracts hold rich personal data that must be cleaned under UK GDPR Art. 89 safeguards.

Why are these files higher risk?

They carry many fields per person, so quasi-identifiers combine easily. Removing names is not enough; you must judge rare value combinations.

Does the clinical signal survive?

Yes. Diagnoses, drugs, and outcomes stay. Only direct identifiers are swapped, and you generalise rare values where needed.
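
Generalising rare values can be as simple as folding any value seen fewer than a chosen number of times into a broader bucket. The following is a sketch under stated assumptions: `generalise_rare`, the `OTHER` placeholder, the threshold, and the sample regions other than London EC1 are hypothetical, and a real project might prefer a meaningful parent category over a catch-all.

```python
from collections import Counter

def generalise_rare(values, min_count, placeholder="OTHER"):
    """Keep values that occur at least min_count times; replace the rest."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else placeholder for v in values]

# Invented sample column from a cleaned cohort.
regions = ["London EC1", "London EC1", "London EC1", "Example Village", "London EC1"]
generalised = generalise_rare(regions, min_count=2)
print(generalised)
```

Common values survive intact, so the bulk of the analytic signal is preserved, while the one-off value no longer singles anyone out.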