Vector Database Selection for HIPAA Workloads

When you build a RAG system that touches PHI, the vector database is in the data path. That means it has to be on your BAA, encrypted at rest with controls you can demonstrate, isolated by tenant, and capable of producing audit logs that are useful to a Security Officer.

This guide covers the options we have shipped to production and the ones we have considered and ruled out, with the trade-offs that matter in HIPAA-aligned work.

What a HIPAA-aligned vector store needs

Before comparing products, the requirements:

BAA coverage. Either the vector database is BAA-covered itself, or it runs inside a covered service (AWS, Azure, GCP under their respective BAAs) on infrastructure you control.
Encryption at rest with customer-managed keys. AWS KMS, Azure Key Vault, or GCP Cloud KMS, with the ability to rotate keys and demonstrate which key encrypted which records.
Encryption in transit. TLS 1.2+, with no exceptions for internal traffic.
Tenant isolation. Separate indexes, namespaces, or hard metadata filters, with evidence that data cannot leak across tenant boundaries.
Audit logging. Every query, write, and admin action with a user identity, timestamp, and result. CloudTrail or equivalent.
Access control. IAM-style permissions, not just a shared API key.
Right-to-deletion. Specific records can be deleted on demand, and the deletion is verifiable.

Anything that does not satisfy all seven gets ruled out for PHI workloads.
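
The all-seven rule can be made mechanical. The following is a minimal sketch, not a compliance tool: the class and field names are our own shorthand for the seven requirements above, and the values you feed in still have to come from a real review of contracts and configuration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HipaaChecklist:
    """One flag per requirement in the list above (field names are illustrative)."""
    baa_coverage: bool
    cmk_encryption_at_rest: bool
    tls_in_transit: bool
    tenant_isolation: bool
    audit_logging: bool
    iam_access_control: bool
    verifiable_deletion: bool


def missing_requirements(checklist: HipaaChecklist) -> list[str]:
    """Return the names of every requirement the candidate fails."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


def eligible_for_phi(checklist: HipaaChecklist) -> bool:
    """A candidate qualifies only when nothing is missing."""
    return not missing_requirements(checklist)


# Hypothetical candidate: an embedded index with no audit log or access control.
candidate = HipaaChecklist(
    baa_coverage=True,
    cmk_encryption_at_rest=True,
    tls_in_transit=True,
    tenant_isolation=True,
    audit_logging=False,
    iam_access_control=False,
    verifiable_deletion=True,
)
print(missing_requirements(candidate))
print(eligible_for_phi(candidate))
```

Returning the list of gaps rather than a bare yes/no keeps the reason for a rejection on record, which is the part a Security Officer will ask about.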

Options we ship to production

Amazon Bedrock Knowledge Bases

For AWS-native HIPAA workloads, this is the default. Knowledge Bases is a managed RAG layer that ingests from S3, generates embeddings, stores them in OpenSearch Serverless or Aurora PostgreSQL with pgvector, and serves retrieval-augmented queries — all under the AWS BAA.

The trade-off: less control over chunking, embedding model selection, and retrieval logic than rolling your own. For workloads where the standard configuration is good enough, the operational simplicity is a real win.

OpenSearch (managed) with k-NN

OpenSearch Service on AWS is HIPAA-eligible. The k-NN plugin handles vector search natively, and OpenSearch's text search is mature, so hybrid retrieval (vector + BM25) works well.

Use this when you need full control over indexing and retrieval logic and you already operate other OpenSearch workloads.
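
To make the hybrid retrieval idea concrete, here is a rough sketch of a request body that pairs a BM25 `match` clause with a `knn` clause. It assumes a recent OpenSearch version that supports the `hybrid` query type, with a search pipeline configured separately to normalize and combine the two score sets (not shown). The field names `text` and `embedding` are hypothetical, so check the body against the version you run before relying on it.

```python
from typing import Any


def hybrid_query_body(query_text: str, query_vector: list[float], k: int = 10) -> dict[str, Any]:
    """Build a search body combining lexical (BM25) and vector (k-NN) retrieval.

    Assumed index mapping: a text field named "text" and a knn_vector field
    named "embedding". Both names are placeholders.
    """
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical leg: standard BM25 scoring on the chunk text.
                    {"match": {"text": {"query": query_text}}},
                    # Vector leg: approximate nearest neighbours on the embedding.
                    {"knn": {"embedding": {"vector": query_vector, "k": k}}},
                ]
            }
        },
    }


body = hybrid_query_body("post-operative infection risk", [0.12, -0.03, 0.88], k=5)
print(body["query"]["hybrid"]["queries"][1])
```

The body is a plain dictionary, so it can be passed to whichever OpenSearch client you already use.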

Aurora PostgreSQL with pgvector

For teams already running Postgres, pgvector is the lowest-friction path. Aurora is HIPAA-eligible. Encryption, IAM, audit logging, and access control come from the database layer you already operate.

The trade-off: pgvector is fast enough for indexes up to a few million vectors, but tuning it for larger workloads requires care. Index choice (HNSW vs IVFFlat), embedding dimensionality, and query patterns all matter.
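
As an illustration of the index choice, the sketch below generates the DDL for either index type. The `USING hnsw` and `USING ivfflat` syntax and the `vector_cosine_ops` operator class are pgvector's; `m = 16` and `ef_construction = 64` are its documented defaults, and the `lists` heuristic (rows / 1000 up to a million rows, the square root of the row count above that) is the starting point suggested in the pgvector README. The table and column names are hypothetical, and every number here is a starting point to benchmark, not a recommendation.

```python
import math


def ivfflat_lists(row_count: int) -> int:
    """Starting value for IVFFlat `lists`, following the pgvector README heuristic."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return max(1, int(math.sqrt(row_count)))


def index_ddl(table: str, column: str, kind: str, row_count: int = 0) -> str:
    """Return CREATE INDEX DDL for a cosine-distance pgvector index.

    `table` and `column` are interpolated directly, so they must be trusted
    identifiers from your own schema, never user input.
    """
    if kind == "hnsw":
        # Slower to build and more memory, but better speed/recall at query time.
        return (
            f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
            "WITH (m = 16, ef_construction = 64);"
        )
    if kind == "ivfflat":
        # Faster to build; should be created after the table holds data.
        lists = ivfflat_lists(row_count)
        return (
            f"CREATE INDEX ON {table} USING ivfflat ({column} vector_cosine_ops) "
            f"WITH (lists = {lists});"
        )
    raise ValueError(f"unknown index kind: {kind!r}")


print(index_ddl("chunks", "embedding", "hnsw"))
print(index_ddl("chunks", "embedding", "ivfflat", row_count=4_000_000))
```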

Azure AI Search

For Microsoft-centric workloads, Azure AI Search is the equivalent of OpenSearch with vector support, integrated with Azure OpenAI for embeddings. HIPAA-covered under Microsoft's BAA.

Options we have considered and ruled out for PHI

Pinecone

Pinecone is fast, well-engineered, and the team behind it knows what they are doing. For workloads that do not involve PHI, it is often the right answer.

For HIPAA, the question is whether your Pinecone deployment is BAA-covered. Pinecone offers HIPAA compliance through specific AWS-hosted deployments under a signed BAA, but the deployment options and pricing are different from the standard product. The lift to verify BAA scope, keep deployments inside the covered tier, and document the data path adds operational overhead. For most clients we work with, the BAA-covered AWS-native options are simpler.

If your team has specific needs Pinecone serves better — multi-region replication, very large indexes, specific performance characteristics — and you are willing to do the BAA verification work, it remains a viable option.

Weaviate, Qdrant, Milvus (self-hosted)

These are excellent open-source vector databases. The catch for HIPAA is that you operate them yourself. Encryption at rest with customer-managed keys, audit logging, access control, backup and recovery, patching, and incident response are all your responsibility.

We use these for non-PHI workloads where the operational burden is justified by performance or cost. For PHI workloads, the engineering and ongoing operations cost rarely justifies the savings over a managed BAA-covered option.

Chroma, FAISS

These are libraries, not databases. They are the right tool for prototyping or for embedded use cases where the index is small and lives in your application. They are not a fit for production PHI workloads — there is no audit log, no access control, no operational story.

Decision framework

AWS-native and you want minimal ops? Bedrock Knowledge Bases.
AWS-native and you need full retrieval control? OpenSearch with k-NN.
Postgres-centric team? Aurora PostgreSQL with pgvector.
Microsoft-centric? Azure AI Search.
Have specific Pinecone-only needs and willing to do BAA verification? Pinecone on the covered tier.
Self-hosting open-source for PHI? Probably not. The total cost of ownership is higher than it looks.
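
The same framework, written as a function. This is only a sketch of the questions above. The parameter names are ours, and the order in which the questions are checked is an assumption, since the list does not rank them. Self-hosting is deliberately not a return value.

```python
def recommend_vector_store(
    cloud: str,
    *,
    minimal_ops: bool = False,
    postgres_centric: bool = False,
    pinecone_specific_needs: bool = False,
    will_verify_baa: bool = False,
) -> str:
    """Map the decision-framework questions to a default recommendation.

    `cloud` is "aws" or "azure"; anything else has no default in this guide.
    """
    if cloud not in ("aws", "azure"):
        raise ValueError(f"no default recommendation for cloud {cloud!r}")
    # Pinecone only when both conditions hold: a real need and the BAA work.
    if pinecone_specific_needs and will_verify_baa:
        return "Pinecone on the covered tier"
    if cloud == "azure":
        return "Azure AI Search"
    if postgres_centric:
        return "Aurora PostgreSQL with pgvector"
    if minimal_ops:
        return "Amazon Bedrock Knowledge Bases"
    return "OpenSearch with k-NN"


print(recommend_vector_store("aws", minimal_ops=True))
print(recommend_vector_store("aws", postgres_centric=True))
print(recommend_vector_store("azure"))
```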

What "tenant isolation" really means

Whichever option you pick, tenant isolation has to be verifiable. The two patterns we use:

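Index-per-tenant. Each customer (or each matter, or each clinic) gets its own index. A query routed to one tenant's index cannot return another tenant's vectors, so isolation depends on routing and permissions rather than on every query carrying the right filter.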
Single index with hard metadata filters. All vectors share an index, but every query includes a tenant filter. The filter is enforced at the application layer before the vector search runs.

Index-per-tenant is more expensive but easier to defend. Single index with filters is cheaper but requires meticulous code review — a single missed filter is a tenant breach. We default to index-per-tenant for healthcare and legal work.
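
Both patterns come down to a small piece of code that every query must pass through. The sketch below shows one way to structure it. The function names, the `rag-` prefix, and the `tenant_id` field are all hypothetical, and the tenant ID is assumed to come from the authenticated session, never from the request body.

```python
import re
from typing import Any, Optional

# Conservative allow-list so a tenant ID can never smuggle in path or wildcard characters.
_TENANT_ID = re.compile(r"[a-z0-9][a-z0-9-]{0,62}")


def tenant_index_name(tenant_id: str, prefix: str = "rag") -> str:
    """Pattern 1: derive the tenant's dedicated index name, rejecting odd IDs."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"{prefix}-{tenant_id}"


def with_tenant_filter(tenant_id: str, extra_filters: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Pattern 2: return query filters that always include the caller's tenant.

    Callers may add their own filters, but may not name a different tenant.
    """
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    filters = dict(extra_filters or {})
    requested = filters.get("tenant_id", tenant_id)
    if requested != tenant_id:
        raise PermissionError("filter names a different tenant than the session")
    filters["tenant_id"] = tenant_id
    return filters


print(tenant_index_name("clinic-42"))
print(with_tenant_filter("clinic-42", {"doc_type": "discharge-summary"}))
```

Putting the filter in one helper does not remove the code-review burden, but it narrows it: reviewers check that no search call bypasses the helper instead of checking every query by hand.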

Audit logging requirements

For each retrieval, log:

The requesting user identity (carried from the application layer)
The query (or its embedding hash, if the query itself is PHI)
The retrieved chunk IDs
The matter / case / patient context
The timestamp and latency

These logs are PHI. Treat them like any other PHI store: encrypted, access-controlled, retained for at least six years in line with HIPAA's documentation retention requirement, and never exported to systems lacking BAA coverage.
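
A minimal sketch of one such log record, using only the standard library. The field names are ours. Note one substitution: instead of hashing the embedding, this version stores a keyed SHA-256 digest (HMAC) of the query text, on the assumption that a secret key makes the digest harder to reverse by guessing likely queries. Hashing the embedding bytes would follow the same shape.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RetrievalAuditRecord:
    user_id: str                # identity carried from the application layer
    query_digest: str           # keyed digest, used when the raw query is PHI
    chunk_ids: tuple[str, ...]  # IDs of the retrieved chunks
    context_id: str             # matter / case / patient context
    timestamp: str              # UTC, ISO 8601
    latency_ms: float


def digest_query(query: str, key: bytes) -> str:
    """HMAC-SHA256 of the query text; `key` should come from your secrets store."""
    return hmac.new(key, query.encode("utf-8"), hashlib.sha256).hexdigest()


def build_record(user_id: str, query: str, chunk_ids: list[str],
                 context_id: str, latency_ms: float, key: bytes) -> RetrievalAuditRecord:
    return RetrievalAuditRecord(
        user_id=user_id,
        query_digest=digest_query(query, key),
        chunk_ids=tuple(chunk_ids),
        context_id=context_id,
        timestamp=datetime.now(timezone.utc).isoformat(),
        latency_ms=round(latency_ms, 1),
    )


def to_json_line(record: RetrievalAuditRecord) -> str:
    """Serialize as one JSON line, suitable for an append-only log sink."""
    return json.dumps(asdict(record), sort_keys=True)


# Placeholder values throughout; the key here is for illustration only.
record = build_record("dr.lee", "recent labs for this patient", ["chunk-17", "chunk-42"],
                      "patient-0007", 83.46, key=b"example-key-not-for-production")
print(to_json_line(record))
```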

Where to start

If you are in early architecture for a HIPAA RAG workload, the cheapest move is to default to Bedrock Knowledge Bases or Aurora pgvector and only deviate when you have specific evidence the default does not work. The temptation to over-engineer the vector layer is strong; resist it. The retrieval quality problems you will hit are almost always about chunking, embeddings, and re-ranking — not the vector database itself.