Dataset anonymization here means removing the 18 HIPAA Safe Harbor identifiers (45 CFR §164.514(b)(2)) from a remote-monitoring feed. anonym.plus runs offline and leaves the blood-pressure, glucose, and weight series intact for analysis.
When this applies
Home monitors stream readings tagged with the patient and the device serial. For population work or model training, those tags must come off.
How anonym.plus handles it
- Point anonym.plus at the export on your server.
- It scans ID columns and any free-text notes.
- Hardware serials and patient keys both get flagged.
- Steady labels keep one patient linked across rows.
- Review the summary, then save the clean set.
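The free-text scan in the steps above can be pictured with a small standard-library sketch. This is not anonym.plus code or its API. The function names are hypothetical, and the serial pattern is only a guess based on the example serial 9F-2207, so a real export would need patterns matched to its own devices.

```python
import re

# Assumed serial format, guessed from the example "9F-2207": two hex characters,
# a hyphen, four digits. Real monitor serials will likely differ.
SERIAL_RE = re.compile(r"\b[0-9A-F]{2}-\d{4}\b")
# Loose IPv4 shape check; it does not validate that each octet is <= 255.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def flag_note(note: str) -> list[tuple[str, str]]:
    """Return (entity_type, matched_text) pairs found in a free-text note."""
    hits = [("ID", m.group()) for m in SERIAL_RE.finditer(note)]
    hits += [("IP_ADDRESS", m.group()) for m in IPV4_RE.finditer(note)]
    return hits


def redact_note(note: str) -> str:
    """Replace flagged serials and IPs with placeholder labels."""
    note = SERIAL_RE.sub("[DEVICE]", note)
    return IPV4_RE.sub("[IP]", note)
```

Calling `flag_note` first gives a summary to review, and `redact_note` applies the swap once the flags look right.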
What you need to provide
- The export (CSV, JSON, or a record bundle).
- A column map for patient and device fields.
- A choice to replace identifiers using a steady label map, so joins still work.
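As an illustration of what a column map might contain, here is a minimal sketch that checks a map against a CSV header before any processing. The column names (`patient_id`, `device_serial`, `reading_ts`) and the `missing_columns` helper are assumptions made for this example, not the format anonym.plus expects.

```python
import csv
import io

# Hypothetical column map: keys are roles, values are column names in the export.
COLUMN_MAP = {
    "patient": "patient_id",
    "device": "device_serial",
    "timestamp": "reading_ts",
}


def missing_columns(csv_text: str, column_map: dict[str, str]) -> list[str]:
    """Return mapped column names that do not appear in the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [col for col in column_map.values() if col not in header]
```

Checking the map up front catches a misnamed patient or device column before it slips through unflagged.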
PHI entity types detected
| Category | anonym.plus entity type | Example |
|---|---|---|
| Names | PERSON | patient_name → [PATIENT_n] |
| Identifiers | ID | monitor serial 9F-2207 → [DEVICE] |
| Dates | DATE_TIME | reading_ts → shifted [TIME] |
| Network | IP_ADDRESS | gateway IP → [IP] |
| Record IDs | MEDICAL_RECORD_NUMBER | mrn field → [MRN] |
| Location | LOCATION | home zip → [PLACE] |
Compliance achieved
- Strips all 18 identifier classes for HIPAA Safe Harbor (45 CFR §164.514(b)(2)).
- Hardware serials and gateway IPs are caught as unique IDs.
- Fully offline — no cloud exposure of home-monitoring streams.
Limitations & cautions
Hardware serials are unique IDs and must go. But a rare reading pattern at a known site can still narrow identity after the obvious tags are gone. Shift the timestamps, and for small cohorts use Expert Determination (45 CFR §164.514(b)(1)) rather than relying on Safe Harbor alone.
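One common way to shift timestamps is to apply a single fixed offset per patient, which hides the real dates while keeping the intervals between that patient's readings. The sketch below assumes ISO 8601 timestamps and a locally held secret key. The function names and the 30-day bound are illustrative choices, not anonym.plus behavior.

```python
import hashlib
import hmac
from datetime import datetime, timedelta


def patient_offset(patient_key: str, secret: bytes, max_days: int = 30) -> timedelta:
    """Derive a stable, non-zero shift of 1..max_days days, forward or back."""
    digest = hmac.new(secret, patient_key.encode(), hashlib.sha256).digest()
    magnitude = int.from_bytes(digest[:4], "big") % max_days + 1
    sign = 1 if digest[4] & 1 else -1
    return timedelta(days=sign * magnitude)


def shift_timestamp(ts: str, patient_key: str, secret: bytes) -> str:
    """Shift one ISO 8601 timestamp by the patient's fixed offset."""
    shifted = datetime.fromisoformat(ts) + patient_offset(patient_key, secret)
    return shifted.isoformat()
```

Because the offset depends only on the patient key and the secret, every reading for one patient moves by the same amount. The secret works like a re-identification key, so it should be stored apart from the output or discarded.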
Frequently asked questions
Is a hardware serial really an identifier?
Yes. Safe Harbor lists device identifiers and serial numbers among the 18. A monitor serial typically maps to one patient, so it is flagged and swapped.
Can rows stay linkable after the swap?
Yes. A steady label map gives each patient one alias, so their readings still join while no real identity remains.
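A steady label map can be sketched as a dictionary that hands out one alias per real ID and returns the same alias on every later lookup. The `LabelMap` class below is a hypothetical illustration of that idea, not the tool's implementation. The `[PATIENT_n]` format follows the example in the entity table.

```python
class LabelMap:
    """Assign each real ID one alias and reuse it on every later lookup."""

    def __init__(self, prefix: str = "PATIENT") -> None:
        self.prefix = prefix
        self._labels: dict[str, str] = {}

    def label(self, real_id: str) -> str:
        # The first sighting creates the alias; later sightings return it unchanged.
        if real_id not in self._labels:
            self._labels[real_id] = f"[{self.prefix}_{len(self._labels) + 1}]"
        return self._labels[real_id]


patients = LabelMap()
rows = ["mrn-001", "mrn-002", "mrn-001"]
aliases = [patients.label(r) for r in rows]
```

Here the first and third rows receive the same alias, so they still join. The map itself links aliases back to real IDs, so it should be protected or deleted once the clean set is saved.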
Why work locally on these feeds?
Sending raw monitoring data to a cloud tool is a disclosure with breach risk. On-device work skips that step entirely.