SDTM de-identification masks or replaces subject keys and removes stray text identifiers in a dataset while keeping the CDISC SDTM submission structure intact. anonym.plus does this offline and leaves the domain variables in place.
When this applies
An analyst needs to share a domain export with a partner who is not entitled to know who the subjects are. The USUBJID values and free-text comment fields must be masked or pseudonymised first.
How anonym.plus handles it
- Open the export (XPT, CSV, or XLSX) in anonym.plus on your machine.
- The tool reads each domain and flags subject keys and comments.
- Local OCR handles any scanned annotation page you add.
- Check the flagged USUBJID values and free-text notes.
- Mask the key, or swap it for a stable pseudonym across domains.
- Save the result locally. The source export never leaves your device.
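The "stable pseudonym across domains" step above can be illustrated with a minimal Python sketch. It is not anonym.plus's own implementation: the function names `build_key_map` and `pseudonymize`, the `PSEUDO_` prefix, and the in-memory row format are assumptions made for illustration, and the sample rows are invented.

```python
from itertools import count

def build_key_map(domains, key="USUBJID", prefix="PSEUDO_"):
    """Assign one pseudonym per distinct subject key, in first-seen order."""
    key_map = {}
    counter = count(1)
    for rows in domains.values():
        for row in rows:
            value = row[key]
            if value not in key_map:
                key_map[value] = f"{prefix}{next(counter)}"
    return key_map

def pseudonymize(domains, key_map, key="USUBJID"):
    """Return copies of the domains with the key swapped; other variables untouched."""
    return {
        name: [{**row, key: key_map[row[key]]} for row in rows]
        for name, rows in domains.items()
    }

# Hypothetical two-domain export, already parsed into rows (assumption).
domains = {
    "DM": [{"USUBJID": "01-701-1015", "SEX": "F"},
           {"USUBJID": "01-701-1023", "SEX": "M"}],
    "AE": [{"USUBJID": "01-701-1023", "AETERM": "HEADACHE"},
           {"USUBJID": "01-701-1015", "AETERM": "NAUSEA"}],
}

key_map = build_key_map(domains)          # keep this apart from the shared copy
shared = pseudonymize(domains, key_map)   # this is what the partner receives
```

Because every domain is rewritten from the same `key_map`, a subject carries the same pseudonym in DM and AE, so the shared domains can still be merged on USUBJID.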
What you need to provide
- The export (XPT, CSV, XLSX, or scan).
- An operator: Mask to hide part of a key, or Replace to substitute a pseudonym.
- Optional: a key map so one subject maps to one pseudonym everywhere.
PHI entity types detected
| Category | anonym.plus entity type | Example |
|---|---|---|
| Subject key | ID | USUBJID 01-701-1015 → [PSEUDO_1] |
| Names in comments | PERSON | "called daughter Mei" → [PERSON] |
| Dates | DATE_TIME | RFSTDTC 2026-01-09 → [DATE] |
| Site | LOCATION | Site 701, Boston → [SITE] |
| Contact | PHONE_NUMBER | 617-555-0188 → [PHONE] |
| Investigator | PERSON | Dr. Patel → [INVESTIGATOR] |
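To make the placeholder idea in the table concrete, here is a small Python sketch that replaces two of the simpler entity types in a free-text comment. The regular expressions, the `mask_text` helper, and the sample comment are illustrative assumptions and say nothing about how anonym.plus actually detects entities. Names such as PERSON cannot be found reliably with patterns like these.

```python
import re

# Deliberately narrow, illustrative patterns (assumptions, not the tool's rules).
PATTERNS = [
    ("[PHONE]", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),  # e.g. 617-555-0188
    ("[DATE]", re.compile(r"\b\d{4}-\d{2}-\d{2}\b")),   # ISO 8601 calendar date
]

def mask_text(text):
    """Replace each pattern match with its placeholder and count the hits."""
    hits = {}
    for placeholder, pattern in PATTERNS:
        text, n = pattern.subn(placeholder, text)
        if n:
            hits[placeholder] = n
    return text, hits

# Invented comment text for demonstration.
comment = "Subject phoned site on 2026-01-09 from 617-555-0188."
masked, hits = mask_text(comment)
print(masked)  # Subject phoned site on [DATE] from [PHONE].
```

Returning the hit counts alongside the masked text gives a reviewer a quick way to see which comments were touched and deserve a second look.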
Compliance achieved
- Keeps the structure expected by CDISC SDTM submissions.
- Runs offline, so the tool itself needs no BAA.
- On-device AES-256-GCM protects the working copy.
- Supports handling of health data, a special category under GDPR Art. 9, for EU subjects.
Limitations & cautions
Masking USUBJID can break a join if you need to merge domains later. Keep a separate key map for that, held apart from the shared copy. Free-text comment variables are the main place stray names hide; review those closely.
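The join problem can be shown with a short Python sketch. It assumes, purely for illustration, that masking keeps only the last four characters of a key; the `mask_key` helper and the sample keys are hypothetical and do not describe the tool's Mask operator.

```python
def mask_key(value, keep=4, fill="*"):
    """Hide all but the last `keep` characters of a key (illustrative rule)."""
    return fill * (len(value) - keep) + value[-keep:]

# Invented keys: the first two differ only in the site portion.
keys = ["01-701-1015", "01-702-1015", "01-701-1023"]

masked = [mask_key(k) for k in keys]
# Two different subjects now share one masked key, so a later merge
# on USUBJID would mix their rows.
collisions = len(keys) - len(set(masked))

# One pseudonym per subject keeps the keys distinct.
pseudonyms = {k: f"PSEUDO_{i}" for i, k in enumerate(keys, start=1)}
distinct_after_replace = len(set(pseudonyms.values()))
```

Here `collisions` is 1 while `distinct_after_replace` is 3, which is why a separately held key map is the safer route when domains must be merged later.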
Frequently asked questions
What is USUBJID in CDISC SDTM?
It is the unique subject identifier that links a person across every domain. On its own it is an indirect identifier. Masking or pseudonymising it stops a partner from tracing a single subject across the export.
Where do stray names usually appear?
In free-text comment variables and supplemental domains. People type a relative's name or a town. The tool scans these fields and flags such text.
Can I keep the domains joinable?
Yes, if you apply one stable pseudonym per subject. Keep the original key map in a separate, protected place so only you can re-link.
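Re-linking with the protected key map might look like the following Python sketch. The map layout (original key to pseudonym), the helper names `invert_key_map` and `relink`, and the sample rows are assumptions for illustration only.

```python
def invert_key_map(key_map):
    """Build pseudonym -> original key, refusing maps that are not one-to-one."""
    inverse = {}
    for original, pseudonym in key_map.items():
        if pseudonym in inverse:
            raise ValueError(f"pseudonym {pseudonym!r} maps to more than one subject")
        inverse[pseudonym] = original
    return inverse

def relink(rows, key_map, key="USUBJID"):
    """Restore original subject keys in rows that carry pseudonyms."""
    inverse = invert_key_map(key_map)
    return [{**row, key: inverse[row[key]]} for row in rows]

# Hypothetical key map held by the data owner only.
key_map = {"01-701-1015": "PSEUDO_1", "01-701-1023": "PSEUDO_2"}

# Invented rows coming back from the partner.
returned = [{"USUBJID": "PSEUDO_2", "AETERM": "HEADACHE"}]
restored = relink(returned, key_map)
```

The one-to-one check matters: if two subjects ever shared a pseudonym, re-linking would be ambiguous, so the sketch fails loudly instead of guessing.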