Real-world evidence anonymization is the removal of patient identifiers from data drawn from routine care. It supports the research safeguards of GDPR Art. 89. anonym.plus runs offline and keeps the clinical signals usable.
When this applies
A team pulls a cohort from electronic records to study outcomes. The extract still carries names, full birth dates, and clinic codes.
How anonym.plus handles it
- Load the extract (CSV, XLSX, or DOCX) into anonym.plus.
- The tool scans structured fields and free-text notes.
- Local OCR reads any scanned chart page you attach.
- Confirm the flagged names, dates, and clinic identifiers.
- Replace each with a consistent pseudonym, so the same person or clinic gets the same token across the file.
- Save the cleaned cohort locally with no upload.
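The consistent-pseudonym step can be illustrated with a minimal Python sketch. It is not anonym.plus code: the function name `pseudonymize_column`, the field names, and the sample rows are all hypothetical, and the sketch assumes the extract has already been parsed into a list of dictionaries.

```python
def pseudonymize_column(rows, field, prefix):
    """Replace each distinct value in `field` with a stable pseudonym.

    Returns the cleaned rows and the original -> pseudonym map.
    Hypothetical helper for illustration; not the anonym.plus API.
    """
    mapping = {}
    cleaned = []
    for row in rows:
        original = row[field]
        # First sighting of a value gets the next numbered token;
        # later sightings reuse it, which keeps records linkable.
        if original not in mapping:
            mapping[original] = f"[{prefix}_{len(mapping) + 1}]"
        # Copy the row so the input data is left untouched.
        cleaned.append({**row, field: mapping[original]})
    return cleaned, mapping


# Made-up sample rows; the second patient name is invented for the example.
rows = [
    {"name": "Hannah Weiss", "diagnosis": "I10"},
    {"name": "Jonas Keller", "diagnosis": "E11"},
    {"name": "Hannah Weiss", "diagnosis": "E11"},
]
cleaned, mapping = pseudonymize_column(rows, "name", "PATIENT")
```

Both rows for the same patient end up with the same token, so longitudinal analysis still works. The returned `mapping` is the part that would need to be stored separately if re-linking is ever required.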
What you need to provide
- The extract (CSV, XLSX, DOCX, or scan).
- An operator choice: Replace to substitute a pseudonym, or Redact to drop a field.
- Optional: a pseudonym map held apart for re-linking.
PHI entity types detected
| Category | anonym.plus entity type | Example |
|---|---|---|
| Names | PERSON | Hannah Weiss → [PATIENT_5] |
| Birth date | DATE_TIME | born 19/02/1947 → [BIRTH_YEAR] |
| Clinic | ORGANIZATION | Mitte GP Practice → [PROVIDER] |
| Location | LOCATION | Berlin 10115 → [REGION] |
| Local ID | MEDICAL_RECORD_NUMBER | chart 60112 → [CHART_ID] |
| Contact | PHONE_NUMBER | +49 30 555 0151 → [PHONE] |
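To make the table concrete, here is a rough sketch of how detected spans might be swapped for placeholders in a free-text note. The entity type names come from the table above; everything else is an assumption for illustration, including the `(start, end, entity_type)` span format, the helper names, and the simplified unnumbered `[PATIENT]` token. Detection itself is not shown.

```python
# Hypothetical placeholder per entity type, loosely following the table above.
PLACEHOLDERS = {
    "PERSON": "[PATIENT]",
    "DATE_TIME": "[BIRTH_YEAR]",
    "ORGANIZATION": "[PROVIDER]",
    "LOCATION": "[REGION]",
    "MEDICAL_RECORD_NUMBER": "[CHART_ID]",
    "PHONE_NUMBER": "[PHONE]",
}


def span_of(text, fragment, entity_type):
    """Build a (start, end, entity_type) span for the first match of fragment."""
    start = text.index(fragment)
    return (start, start + len(fragment), entity_type)


def apply_placeholders(text, spans):
    """Replace each non-overlapping span with the placeholder for its type."""
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, entity_type in sorted(spans, reverse=True):
        text = text[:start] + PLACEHOLDERS[entity_type] + text[end:]
    return text


note = "Hannah Weiss, chart 60112, phone +49 30 555 0151."
spans = [
    span_of(note, "Hannah Weiss", "PERSON"),
    span_of(note, "chart 60112", "MEDICAL_RECORD_NUMBER"),
    span_of(note, "+49 30 555 0151", "PHONE_NUMBER"),
]
masked = apply_placeholders(note, spans)
```

Replacing from the end of the string backwards avoids recomputing offsets when a placeholder is longer or shorter than the text it replaces.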
Compliance achieved
- Supports the safeguards expected under GDPR Art. 89.
- Runs offline with no upload, so the tool itself needs no business associate agreement (BAA).
- On-device AES-256-GCM protects the working copy.
- Falls outside GDPR scope under Recital 26 only once no individual can reasonably be identified.
Limitations & cautions
Routine-care extracts are rich, so quasi-identifiers stack up fast. The tool removes direct identifiers and flags rare birth dates. A rare diagnosis combined with a small region can still re-identify someone, so test the combinations of remaining fields before you share.
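One common way to test combinations is a k-anonymity-style count: group rows by their quasi-identifiers and look for groups smaller than a chosen threshold. The sketch below is a generic check, not a feature of anonym.plus; the function name `small_groups`, the field names, the threshold `k`, and the sample cohort are assumptions made for the example.

```python
from collections import Counter


def small_groups(rows, quasi_identifiers, k):
    """Return quasi-identifier combinations shared by fewer than k rows."""
    counts = Counter(
        tuple(row[field] for field in quasi_identifiers) for row in rows
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up cohort with illustrative diagnosis codes.
cohort = [
    {"birth_year": "1947", "region": "Berlin", "diagnosis": "I10"},
    {"birth_year": "1947", "region": "Berlin", "diagnosis": "I10"},
    {"birth_year": "1947", "region": "Berlin", "diagnosis": "E75.2"},
]
risky = small_groups(cohort, ["birth_year", "region", "diagnosis"], k=2)
```

Here `risky` holds the single combination that appears only once, which is the row to generalise or suppress. The right value of `k` depends on the release context and is a judgment call for the data owner.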
Frequently asked questions
What is real-world evidence?
It is evidence about care and outcomes drawn from routine sources such as electronic records or claims, rather than from a controlled trial. Such extracts hold rich personal data that must be cleaned under Art. 89 safeguards.
Why are these files higher risk?
They carry many fields per person, so quasi-identifiers combine easily. Removing names is not enough; you must judge rare value combinations.
Does the clinical signal survive?
Yes. Diagnoses, drugs, and outcomes stay. Only direct identifiers are swapped, and you generalise rare values where needed.
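As an illustration of generalising rare values, the sketch below coarsens a full birth date to a year and folds infrequent categories into a catch-all label. It is a minimal example under stated assumptions, not the tool's own logic: the helper names, the DD/MM/YYYY input format, the `OTHER` label, and the sample values are all hypothetical.

```python
from collections import Counter
from datetime import datetime


def to_birth_year(date_text):
    """Reduce a DD/MM/YYYY birth date to its year (assumed input format)."""
    return str(datetime.strptime(date_text, "%d/%m/%Y").year)


def generalise_rare(values, min_count, fallback="OTHER"):
    """Replace values seen fewer than min_count times with a fallback label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else fallback for v in values]


year = to_birth_year("19/02/1947")
# Made-up region values; the rare one is folded into the fallback label.
regions = generalise_rare(["Berlin", "Berlin", "Usedom"], min_count=2)
```

Applying this to fields such as region or birth date, rather than to diagnoses or drugs, keeps the clinical columns intact while shrinking the number of unique combinations.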