Glossary
PHI (Protected Health Information)
Any health information that can be linked to an individual through one of the 18 identifier types HIPAA defines, such as names, dates, addresses, medical record numbers, and biometric identifiers.
PHI is the regulatory category that triggers HIPAA's safeguards. The definition is broader than most engineers expect: it covers not just diagnoses and treatments, but any of the 18 HIPAA identifiers when linked to health information. Those identifiers are names, geographic subdivisions smaller than a state, dates more specific than a year, phone numbers, fax numbers, email addresses, Social Security numbers, medical record numbers, health plan beneficiary numbers, account numbers, certificate and license numbers, vehicle identifiers, device identifiers, URLs, IP addresses, biometric identifiers, full-face photos, and any other unique identifying number, characteristic, or code.
The key practical points: an IP address logged alongside a clinical note is PHI. A cookie ID linked to a patient portal session is PHI. Free-text fields where users might paste anything should be presumed to contain PHI. Once data is PHI, it stays PHI until it has been formally de-identified using one of HIPAA's two methods, Safe Harbor or Expert Determination.
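The practical points above can be turned into a rough triage check. The sketch below is illustrative only: the regular expressions cover just a few identifier types, the field names in FREE_TEXT_FIELDS are hypothetical, and a match (or the lack of one) is a heuristic signal, not a compliance determination and not a de-identification method.

```python
import re

# Illustrative patterns for a handful of identifier types; far from exhaustive.
IDENTIFIER_PATTERNS = {
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

# Hypothetical names of fields where users can type or paste anything.
FREE_TEXT_FIELDS = {"note", "message", "comment"}


def find_identifiers(text):
    """Return the sorted names of the identifier patterns found in text."""
    return sorted(
        name for name, pattern in IDENTIFIER_PATTERNS.items() if pattern.search(text)
    )


def presume_phi(record, health_context=True):
    """Return (is_presumed_phi, reasons) for a dict-shaped record.

    health_context says whether the record sits alongside health data;
    an identifier only counts as PHI when linked to health information.
    """
    reasons = []
    for field_name, value in record.items():
        if not isinstance(value, str):
            continue
        # Non-empty free text is presumed to contain PHI without scanning it.
        if field_name in FREE_TEXT_FIELDS and value.strip():
            reasons.append(f"{field_name}: free text")
        for identifier in find_identifiers(value):
            reasons.append(f"{field_name}: {identifier}")
    return (health_context and bool(reasons), reasons)


if __name__ == "__main__":
    log_line = {"client_ip": "203.0.113.7", "note": "Patient reports chest pain"}
    print(presume_phi(log_line))
```

A check like this is best used to find places where PHI is likely to live, so that those stores get reviewed; it should not be used to declare a dataset free of PHI.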
Most AI projects underestimate where PHI lives. A "health analytics" pipeline that ingests appointment timestamps and provider IDs is handling PHI. An LLM prompt log that captures user input is a PHI store. Designing the system to acknowledge this from day one is much cheaper than retrofitting compliance later.
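One way to acknowledge from day one that a prompt log is a PHI store is to make every log sink declare whether it is allowed to hold PHI. The sketch below assumes a hypothetical LogSink type with a phi_approved flag; the names and the choice of what metadata is safe to keep are illustrative assumptions, and the real decision belongs to a compliance review.

```python
from dataclasses import dataclass, field


@dataclass
class LogSink:
    """Hypothetical log destination used for illustration."""
    name: str
    phi_approved: bool  # assumption: True only for stores inside the HIPAA boundary
    entries: list = field(default_factory=list)


def log_prompt(sink, user_input, request_id):
    """Write raw user input only to sinks approved to hold PHI.

    Other sinks receive size metadata and never the prompt text itself.
    """
    if sink.phi_approved:
        sink.entries.append({"request_id": request_id, "prompt": user_input})
    else:
        sink.entries.append({"request_id": request_id, "prompt_chars": len(user_input)})


if __name__ == "__main__":
    audit_log = LogSink(name="phi_audit", phi_approved=True)
    app_log = LogSink(name="app", phi_approved=False)
    for sink in (audit_log, app_log):
        log_prompt(sink, "I have diabetes", request_id="r1")
    print(audit_log.entries)
    print(app_log.entries)
```

Making the flag mandatory on every sink forces the PHI question to be answered when a log destination is created, rather than discovered during a later audit.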
Related terms
HIPAA-Aligned AI
AI systems designed so that protected health information (PHI) flows only through HIPAA-eligible services, with audit logging, access controls, and BAA coverage end-to-end.
BAA (Business Associate Agreement)
A contract under HIPAA between a covered entity and any third party that handles PHI on its behalf, defining each party's responsibilities for protecting that data.