Real-world evidence anonymisation removes patient identifiers from data collected during routine care. It supports the research safeguards required by UK GDPR Art. 89. anonym.plus runs offline and keeps the clinical signal usable.
When this applies
A team pulls a cohort from electronic records to study outcomes. The extract still carries names, full birth dates, and clinic codes.
How anonym.plus handles it
- Load the extract (CSV, XLSX, or DOCX) into anonym.plus.
- The tool scans structured fields and free-text notes.
- Local OCR reads any scanned chart page you attach.
- Confirm the flagged names, dates, and clinic identifiers.
- Replace each with a consistent pseudonym, so the same patient gets the same placeholder throughout the file.
- Save the cleaned cohort locally with no upload.
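
The consistent-pseudonym step above can be illustrated with a minimal Python sketch. It is not anonym.plus code or its API: the function name `pseudonymise`, the `PATIENT` prefix, and the synthetic rows are assumptions made for illustration only.

```python
import csv
import io


def pseudonymise(rows, column, prefix="PATIENT"):
    """Replace each distinct value in `column` with a stable placeholder.

    Hypothetical helper for illustration; not part of anonym.plus.
    """
    mapping = {}  # original value -> pseudonym, reused on every repeat
    cleaned = []
    for row in rows:
        original = row[column]
        if original not in mapping:
            mapping[original] = f"[{prefix}_{len(mapping) + 1}]"
        # Copy the row so the input is left untouched.
        cleaned.append({**row, column: mapping[original]})
    return cleaned, mapping


# Synthetic sample data, not real patients.
sample = io.StringIO(
    "name,diagnosis\n"
    "Hannah Watkins,Asthma\n"
    "Omar Reid,COPD\n"
    "Hannah Watkins,Asthma review\n"
)
rows = list(csv.DictReader(sample))
cleaned, mapping = pseudonymise(rows, "name")
for row in cleaned:
    print(row["name"], row["diagnosis"])
```

Both rows for the same patient receive the same placeholder, so a longitudinal analysis can still follow one person across visits without seeing the name.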
What you need to provide
- The extract (CSV, XLSX, DOCX, or scan).
- An operator: Replace to substitute pseudonyms, or Redact to drop a field.
- Optional: a pseudonym map held apart for re-linking.
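
One way to picture a pseudonym map "held apart" is a separate file that never travels with the cleaned cohort. The sketch below is an assumption about how such a map could be stored and used, not a description of the anonym.plus file format. The file name `pseudonym_map.json` and the helpers `save_map` and `relink` are hypothetical, and the temporary directory stands in for a separately controlled location.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical mapping produced during pseudonymisation (synthetic names).
mapping = {"Hannah Watkins": "[PATIENT_1]", "Omar Reid": "[PATIENT_2]"}


def save_map(mapping, path):
    """Write the map to its own file, apart from the cleaned cohort."""
    path.write_text(json.dumps(mapping, indent=2), encoding="utf-8")


def relink(pseudonym, path):
    """Look up the original value for a pseudonym; None if unknown."""
    stored = json.loads(path.read_text(encoding="utf-8"))
    reverse = {placeholder: original for original, placeholder in stored.items()}
    return reverse.get(pseudonym)


# The temporary directory stands in for a separately secured location.
with tempfile.TemporaryDirectory() as secure_dir:
    map_path = Path(secure_dir) / "pseudonym_map.json"
    save_map(mapping, map_path)
    relinked = relink("[PATIENT_2]", map_path)
    unknown = relink("[PATIENT_9]", map_path)

print(relinked)
```

While such a map exists, the cleaned file is pseudonymised rather than anonymous, so it remains personal data for anyone who could obtain the map.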
Patient data entity types detected
| Category | anonym.plus entity type | Example |
|---|---|---|
| Names | PERSON | Hannah Watkins → [PATIENT_5] |
| Birth date | DATE_TIME | born 19/02/1947 → [BIRTH_YEAR] |
| Clinic | ORGANIZATION | Holborn GP Practice → [PROVIDER] |
| Location | LOCATION | London EC1 → [REGION] |
| NHS number | MEDICAL_RECORD_NUMBER | NHS 485 777 3310 → [NHS_NO] |
| Contact | PHONE_NUMBER | +44 20 7946 0151 → [PHONE] |
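
To make the table concrete, here is a small pattern-based sketch that swaps three of the listed entity types for their placeholders in a free-text note. The regular expressions are simplifying assumptions and do not reflect how anonym.plus detects entities. They only cover the exact formats shown in the Example column, and the `mask` helper is hypothetical.

```python
import re

# Illustrative patterns only; real detection needs far broader coverage.
PATTERNS = [
    (re.compile(r"\bNHS\s*\d{3}\s?\d{3}\s?\d{4}\b"), "[NHS_NO]"),
    (re.compile(r"\+44\s?\d{2}\s?\d{4}\s?\d{4}"), "[PHONE]"),
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "[BIRTH_YEAR]"),
]


def mask(text):
    """Apply each pattern in turn and return the masked text."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


# Synthetic note reusing the example values from the table.
note = "Patient born 19/02/1947, NHS 485 777 3310, call +44 20 7946 0151."
masked = mask(note)
print(masked)
```

Names, clinics, and locations are left out on purpose: they have no fixed shape, which is why they need entity recognition and a human confirmation step rather than a pattern.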
Compliance achieved
- Supports the safeguards expected under UK GDPR Art. 89.
- Runs offline, so the anonymisation step involves no cloud provider and needs no data-processor contract.
- On-device AES-256-GCM protects the working copy.
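- Can take the output outside UK GDPR scope (Recital 26) once no individual is identifiable by any means reasonably likely to be used. While a re-linking map exists, the data remains pseudonymised personal data.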
Limitations & cautions
Routine-care extracts are rich, so quasi-identifiers accumulate quickly. The tool removes direct identifiers and flags rare birth dates. A rare diagnosis combined with a small region can still re-identify someone, so test such combinations before you share the data.
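
Testing the combinations can be as simple as counting how many rows share each set of quasi-identifier values, in the spirit of a k-anonymity check. The sketch below is one possible approach, not an anonym.plus feature. The function `small_groups`, the column names, and the threshold `k=5` are assumptions, and the right threshold depends on your own risk assessment.

```python
from collections import Counter


def small_groups(rows, quasi_identifiers, k=5):
    """Return quasi-identifier combinations shared by fewer than k rows."""
    counts = Counter(
        tuple(row[column] for column in quasi_identifiers) for row in rows
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Synthetic cohort: six common records and one rare diagnosis.
cohort = (
    [{"birth_year": "1947", "region": "London", "diagnosis": "Asthma"}] * 6
    + [{"birth_year": "1947", "region": "London", "diagnosis": "Fabry disease"}]
)

risky = small_groups(cohort, ["birth_year", "region", "diagnosis"], k=5)
print(risky)
```

Here birth year and region alone describe seven people, but adding the diagnosis isolates a single record, which is the kind of combination to generalise or suppress before sharing.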
Frequently asked questions
What is real-world evidence?
It is evidence about care and outcomes drawn from routine sources such as electronic records or claims, rather than from a controlled trial. Such extracts hold detailed personal data, so research use calls for safeguards under UK GDPR Art. 89, such as pseudonymisation or anonymisation.
Why are these files higher risk?
They carry many fields per person, so quasi-identifiers combine easily. Removing names is not enough; you must also assess rare combinations of values.
Does the clinical signal survive?
Yes. Diagnoses, drugs, and outcomes stay. Only direct identifiers are swapped, and you generalise rare values where needed.
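
Generalising rare values can be sketched as follows: values that appear fewer than a chosen number of times in one column are folded into a broader bucket, while the clinical columns are left alone. This is an illustrative assumption rather than the tool's own logic. The helper `generalise_rare`, the `OTHER` bucket, and the `min_count=5` threshold are hypothetical choices.

```python
from collections import Counter


def generalise_rare(rows, column, min_count=5, placeholder="OTHER"):
    """Fold values seen fewer than min_count times into one bucket."""
    counts = Counter(row[column] for row in rows)
    return [
        {
            **row,
            column: row[column] if counts[row[column]] >= min_count else placeholder,
        }
        for row in rows
    ]


# Synthetic cohort: one region is common, one appears only once.
cohort = (
    [{"region": "London EC1", "diagnosis": "Asthma"}] * 5
    + [{"region": "Isles of Scilly", "diagnosis": "COPD"}]
)

generalised = generalise_rare(cohort, "region")
for row in generalised:
    print(row["region"], row["diagnosis"])
```

The diagnosis column passes through unchanged, so the outcome analysis keeps its signal while the single small-region record no longer stands out.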