BAA-Ready AI: What to Ask Vendors

"We support HIPAA" is a sentence vendors say. What it means in practice ranges from "we have a BAA template ready to send" to "we have never seen a healthcare customer and we hope it works out."

This is the list of questions we use during vendor evaluation for AI products that will touch PHI. The answers you get separate vendors who have done this work from vendors who think they can.

On the BAA itself

Will you sign your BAA, or ours? Most established health-tech vendors have a BAA template they will sign, and some insist on using their own rather than the customer's. Either can work. A vendor who balks at signing any BAA is not a healthcare vendor.

What is the BAA's scope? Does it cover all of the vendor's services or just specific ones? Does it cover sub-processors? Does it carve out any data uses (analytics, model improvement) that a covered entity should refuse?

What sub-processors are in scope? Every third party in the data path needs BAA coverage: the LLM provider, the embedding endpoint, the vector database, the logging service. Get the list.

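What is the breach notification timeline? HIPAA requires a business associate to notify the covered entity without unreasonable delay, and no later than 60 days after discovering a breach. Vendor BAAs commonly tighten this to 24, 48, or 72 hours. Anything longer than that should be flagged.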

On the data path

Where does PHI go from the moment we send it? The vendor should be able to draw a diagram of every system the data touches, with each labeled as "BAA-covered" or "not." If they cannot, they have not thought through the data path.
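One way to make that diagram checkable is to write the data path down as data. The sketch below is illustrative only: the `Hop` type, the `uncovered_hops` helper, and every system name are assumptions made up for the example, not any vendor's actual architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One system that PHI passes through (hypothetical example type)."""
    name: str
    operator: str      # who runs it: the vendor or a sub-processor
    baa_covered: bool  # is there a signed BAA covering this hop?


def uncovered_hops(path: list[Hop]) -> list[str]:
    """Return the names of hops in the data path that lack BAA coverage."""
    return [hop.name for hop in path if not hop.baa_covered]


# Hypothetical data path, in the order PHI flows through it.
data_path = [
    Hop("api-gateway", "vendor", True),
    Hop("llm-endpoint", "model provider", True),
    Hop("embedding-endpoint", "model provider", True),
    Hop("vector-database", "sub-processor", True),
    Hop("logging-service", "sub-processor", False),
]

gaps = uncovered_hops(data_path)
print(gaps)  # ['logging-service']
```

An empty result is what you want to see. Any name in the list is a system that receives PHI without a BAA behind it.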

Where is data stored? AWS region, Azure region, GCP region. Multi-region is fine; "we don't know" is not. Some healthcare contracts require US-based storage; some require specific regions.

Where do models run? A model accessed via the vendor's API endpoint is not the same as a model deployed in your VPC. Both can be appropriate; the vendor should be clear which they offer.

What happens to PHI at rest? Encryption with which key? Customer-managed keys? Vendor-managed keys with attestation? The detail matters.

On training data

Is our data used to train models? At the major LLM providers (Anthropic, OpenAI, AWS Bedrock), the enterprise-tier default is no. Confirm the contract reflects this, and verify that the vendor itself does not use your data for training downstream.

Does the vendor fine-tune models on customer data? If yes, what is the data lifecycle? If a customer leaves, can the contributions to fine-tuned models be removed? Often the answer is "not really," which is a deal-breaker for sensitive workloads.

Does the vendor use your data for anything other than serving you? "Quality improvement," "feature development," "product analytics" are all answers that should slow procurement down. Get specifics.

On audit and logging

What is logged for every model invocation? The right answer includes: requesting user identity, prompt, retrieved context, model output, tool calls, timestamps. "Aggregate analytics" is not the right answer for HIPAA work.
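As a rough picture of what such a record could look like, here is a minimal sketch of one audit entry per invocation. The `InvocationRecord` name, the field names, and the placeholder values are assumptions for illustration, not a standard schema or any vendor's format.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class InvocationRecord:
    """One audit entry per model invocation (hypothetical schema)."""
    user_id: str                   # requesting user identity
    prompt: str                    # the prompt as sent to the model
    retrieved_context: list[str]   # identifiers of retrieved chunks
    model_output: str              # the response as returned
    tool_calls: list[dict]         # each tool call with its arguments
    requested_at: str              # UTC timestamp, ISO 8601
    completed_at: str              # UTC timestamp, ISO 8601


def utc_now() -> str:
    """Current time as a timezone-aware ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


started = utc_now()
record = InvocationRecord(
    user_id="user-0001",
    prompt="[prompt text]",
    retrieved_context=["doc-17#chunk-3"],
    model_output="[model output text]",
    tool_calls=[{"name": "lookup_policy", "arguments": {"policy_id": "p-7"}}],
    requested_at=started,
    completed_at=utc_now(),
)

# One JSON object per line is easy to ship to an append-only log store.
log_line = json.dumps(asdict(record), sort_keys=True)
print(log_line)
```

Because the record carries the prompt and the output, it contains PHI and needs the same protections as the source data.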

Can we access the audit log directly? Some vendors keep the AI audit log internal. For HIPAA, you need to be able to query it without a support ticket. API access, with rate limits and quotas, is the expectation.

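How long are audit logs retained? Six years is the usual minimum, matching HIPAA's documentation retention requirement. Some vendors keep less; some keep logs indefinitely. Both have implications: too little falls short of the benchmark, and indefinite retention means PHI that never ages out.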

Are audit logs encrypted at rest? With customer-managed keys? The audit log is PHI. Treat it that way.

On tenant isolation

How is our data isolated from other customers'? Index-per-tenant, namespace-per-tenant, or shared resources with metadata filters. Each has different security implications. Vendors who answer "logically separated" without specifics are a red flag.
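The difference between these models is easier to see in a toy example. The sketch below uses two in-memory classes, `SharedIndex` and `NamespacedIndex`, both invented for illustration. Real vector databases expose different APIs, so treat this as a picture of the failure mode rather than of any product.

```python
from collections import defaultdict


class SharedIndex:
    """Shared resource: isolation depends on every query passing a filter."""

    def __init__(self):
        self._docs = []  # list of (tenant_id, text) pairs

    def add(self, tenant_id, text):
        self._docs.append((tenant_id, text))

    def search(self, term, tenant_id=None):
        # If a caller forgets tenant_id, every tenant's documents can match.
        return [
            text for owner, text in self._docs
            if term in text and (tenant_id is None or owner == tenant_id)
        ]


class NamespacedIndex:
    """Namespace-per-tenant: every query is scoped to one namespace."""

    def __init__(self):
        self._namespaces = defaultdict(list)

    def add(self, tenant_id, text):
        self._namespaces[tenant_id].append(text)

    def search(self, tenant_id, term):
        # tenant_id is a required argument, so there is no unscoped query.
        return [t for t in self._namespaces.get(tenant_id, []) if term in t]


shared = SharedIndex()
namespaced = NamespacedIndex()
for tenant, text in [("tenant-a", "discharge summary A"),
                     ("tenant-b", "discharge summary B")]:
    shared.add(tenant, text)
    namespaced.add(tenant, text)

leaked = shared.search("discharge")                       # filter omitted
scoped = shared.search("discharge", tenant_id="tenant-a")
isolated = namespaced.search("tenant-a", "discharge")
print(leaked, scoped, isolated)
```

In the shared design, one missing filter returns both tenants' documents. In the namespaced design that mistake cannot be expressed. This is the kind of specific a vendor should be able to give.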

Can you demonstrate isolation? The vendor should be able to show the architecture, the access control layer, and the controls that enforce isolation. Marketing diagrams are not demonstration.

What happens if another customer experiences a breach? Worst case, what is your exposure? A vendor whose architecture has a single shared resource with weak isolation is one breach away from exposing your data as well.

On model behavior

Which model is being used? Specifics. "Claude Sonnet" is more useful than "an LLM." "Claude 3.5 Sonnet on AWS Bedrock" is the level of specificity that matters.

When does the model change? Vendors update underlying models. Will you be notified? Is there an evaluation step? Will behavior change without your knowledge?
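If the vendor reports a model identifier with each invocation, you can check for silent changes yourself. This is a minimal sketch under that assumption: the `model` and `request_id` fields, the identifier strings, and the `drifted_requests` helper are all hypothetical.

```python
# Hypothetical identifier: the exact version you evaluated and approved.
PINNED_MODEL = "example-model-v1"


def drifted_requests(responses: list[dict],
                     pinned: str = PINNED_MODEL) -> list[str]:
    """Return request IDs served by a model other than the pinned one."""
    return [r["request_id"] for r in responses if r["model"] != pinned]


# Hypothetical response metadata, as a vendor might report it per invocation.
responses = [
    {"request_id": "req-1", "model": "example-model-v1"},
    {"request_id": "req-2", "model": "example-model-v2"},
]

changed = drifted_requests(responses)
print(changed)  # ['req-2']
```

A vendor that cannot supply the identifier per invocation has effectively answered the notification question already.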

Are the prompts proprietary? If the vendor will not show you the prompts, you cannot independently explain the model's behavior, and you are relying on the vendor to do it for you. For HIPAA work where every output may need to be defended, this matters.

What guardrails are in place? Content filters, PII detection, prompt-injection protection. The vendor should have answers.

On outputs

Are outputs cited? Every claim the model makes should reference the source documents that informed it. Free-form outputs without citations are not auditable.

What happens when retrieval comes back empty? The system tells the user it does not know. Or the system makes something up. The first is acceptable for healthcare; the second is not.
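The acceptable behavior can be sketched in a few lines. This is a minimal illustration, assuming a retriever that returns a list of source dicts with a `doc_id` key and a generator that takes the question plus those sources. The function names, the refusal message, and the stubs are all hypothetical.

```python
from typing import Callable

NO_ANSWER = "I could not find this in the available documents."


def answer(question: str,
           retrieve: Callable[[str], list[dict]],
           generate: Callable[[str, list[dict]], str]) -> dict:
    """Answer only from retrieved sources; refuse when there are none."""
    sources = retrieve(question)
    if not sources:
        # No context to cite, so the model is never called.
        return {"text": NO_ANSWER, "citations": []}
    text = generate(question, sources)
    return {"text": text, "citations": [s["doc_id"] for s in sources]}


# Stubs standing in for a real retriever and a real model call.
def empty_retriever(question: str) -> list[dict]:
    return []


def stub_retriever(question: str) -> list[dict]:
    return [{"doc_id": "policy-7", "text": "[policy text]"}]


def stub_generate(question: str, sources: list[dict]) -> str:
    return f"Per {sources[0]['doc_id']}: [answer text]"


refused = answer("What is the visiting policy?", empty_retriever, stub_generate)
grounded = answer("What is the visiting policy?", stub_retriever, stub_generate)
print(refused)
print(grounded)
```

The point of the design is that the refusal happens before generation, so an empty retrieval cannot produce an uncited answer.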

Is there a human-in-the-loop boundary? For any clinical decision, a clinician should review and approve. The vendor's product should support this. If the workflow is "AI takes action automatically," that is a flag for HIPAA workloads.
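In product terms, the boundary is a gate that a proposed action cannot pass without a named reviewer. A minimal sketch, with `ProposedAction`, `approve`, `execute`, and the identifiers all invented for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProposedAction:
    """An action the AI suggests but may not carry out on its own."""
    description: str
    approved_by: Optional[str] = None  # clinician ID once reviewed


class ApprovalRequired(Exception):
    """Raised when an unapproved action reaches the execution step."""


def approve(action: ProposedAction, clinician_id: str) -> ProposedAction:
    """Record which clinician reviewed and approved the action."""
    action.approved_by = clinician_id
    return action


def execute(action: ProposedAction) -> str:
    """Carry out the action only if a clinician has approved it."""
    if action.approved_by is None:
        raise ApprovalRequired(action.description)
    return f"executed: {action.description} (approved by {action.approved_by})"


action = ProposedAction("order lab panel")
try:
    execute(action)
    blocked = False
except ApprovalRequired:
    blocked = True

result = execute(approve(action, "clinician-42"))
print(blocked, result)
```

Recording the approver on the action also gives the audit log a named human for each decision.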

On customer support

Has the vendor passed a real procurement at a covered entity? Reference customers. Names. The vendor should be willing to put you in touch with at least one customer in healthcare.

What is the SLA for security incidents? AI systems break in novel ways. Hallucinations, prompt injection, retrieval failures. The vendor's response capability matters as much as their preventative posture.

Who is your security contact? A specific person. A specific email. A specific escalation path. Not "support@vendor.com."

Red flags

A few responses that should slow procurement down materially:

"We use the latest models" without specifying.
"Your data is secure" without explanation.
"We're working on HIPAA" — not the same as HIPAA today.
"We can sign a BAA but it covers fewer services than we offer" — the carve-outs are where the risk lives.
"Our model provider handles that" — they may, but the contract is between you and the vendor in front of you.
Vague answers to detailed technical questions, especially in front of a Security Officer.

Some red flags are disqualifying. Some indicate a vendor whose security posture is less mature than they realize, where the right move is a longer evaluation and a stricter SOW.

The deeper test

Beyond the checklist, the signal that often separates ready vendors from unready ones is how the vendor talks about regulatory work.

Vendors who have done HIPAA work talk about it as a set of practical engineering and operational decisions. Encryption keys, audit log schemas, retention policies, sub-processor lists. The work is detailed but tractable.

Vendors who have not done HIPAA work tend to talk about it abstractly. "We're compliant." "We support HIPAA." "Our infrastructure is secure." The lack of specifics is the signal.

If you are working through this checklist and the vendor's answers are consistently abstract, that is the answer. Move on.

Where we fit

We regularly do BAA-readiness reviews for healthcare clients evaluating AI vendors. The conversation often produces a shorter shortlist than the original RFP, but the vendors that survive are the ones that will clear the rest of the procurement process. Get in touch if you have a vendor evaluation in front of you.