Multi-Tenant SaaS on AWS Amplify Gen 2

AWS Amplify Gen 2 is a sensible default for most B2B SaaS we ship. The framework gives you Cognito-based auth, AppSync GraphQL, DynamoDB-backed data, S3 storage, and Lambda functions in a TypeScript-first project structure. For multi-tenant SaaS — especially in regulated industries — the framework is enough infrastructure to get you started, with the right hooks to extend where you need to.

This guide covers the patterns we use for tenancy, isolation, and operations on Amplify Gen 2. It assumes you have done the basic setup and are now thinking about how to make the system work for multiple customers without leakage.

What "multi-tenant" actually means

Three architectures get called "multi-tenant SaaS." They are different.

Pool model. All tenants share infrastructure. A tenant_id column on every row. Isolation is enforced in software.
Bridge model. Some resources are shared (the Cognito user pool, the application code), others are per-tenant (databases, S3 buckets, KMS keys).
Silo model. Each tenant gets its own stack — its own Cognito pool, its own DynamoDB tables, its own everything. Isolation is enforced at the AWS account or region level.

Pool is cheapest. Silo is most defensible. Bridge is the compromise most regulated SaaS converges on. Amplify Gen 2 supports all three; the design choice is yours, not the framework's.

For HIPAA workloads with sensitive data, we usually ship a bridge model: shared user pool, shared application code, but per-tenant data partitions, S3 bucket prefixes, KMS keys, and audit logs. The decision depends on customer expectations more than technical fit.

Auth and tenancy

Amplify Gen 2 uses Cognito User Pools. For multi-tenant SaaS, the choice is between two layouts:

One user pool, with the tenant ID as a custom attribute (custom:tenantId in the examples here). Single sign-on per identity, simpler to operate.
Multiple user pools, one per tenant. Stronger isolation, harder to operate at scale, often preferred by enterprise buyers.

We default to one pool with custom attributes. When a tenant's procurement requires a dedicated pool — not uncommon in healthcare — we move that tenant to a separate pool while keeping the rest in the shared one. This is not symmetric; it is a deliberate trade.
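
The asymmetric arrangement above amounts to a lookup with a default. A minimal sketch, assuming a hypothetical in-code registry of dedicated pools keyed by tenant ID; the pool IDs, the tenant name, and resolveUserPool are illustrative and not part of Amplify:

```typescript
// Hypothetical pool identifiers; real values come from your deployment outputs.
const SHARED_POOL_ID = "us-east-1_SHARED"

// Tenants whose contracts require a dedicated pool. Everyone else is shared.
const dedicatedPools: ReadonlyMap<string, string> = new Map([
  ["tenant-acme-health", "us-east-1_ACME"],
])

type PoolAssignment = { userPoolId: string; dedicated: boolean }

// Resolve which Cognito user pool a tenant authenticates against.
function resolveUserPool(tenantId: string): PoolAssignment {
  const dedicated = dedicatedPools.get(tenantId)
  return dedicated
    ? { userPoolId: dedicated, dedicated: true }
    : { userPoolId: SHARED_POOL_ID, dedicated: false }
}
```

Keeping the exception list explicit makes the trade visible: a tenant is on the shared pool unless someone deliberately moved it.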

The tenant attribute has to be unforgeable at the application layer. In Amplify Gen 2, that means the tenant ID is set during user creation by the admin flow, is not writable by the user (the app client must not have write permission on the attribute), and is read in every API call from the JWT, never from the request body.

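// schema.ts
import { a } from "@aws-amplify/backend"

const schema = a.schema({
  Patient: a
    .model({
      mrn: a.string().required(),
      name: a.string().required(),
      tenantId: a.string().required(),
    })
    .authorization((allow) => [
      // Compare the row's tenantId to the caller's custom:tenantId claim
      // rather than to the default user identity claim.
      allow
        .ownerDefinedIn("tenantId")
        .identityClaim("custom:tenantId")
        .to(["read", "create", "update"]),
    ]),
})

On its own, ownerDefinedIn("tenantId") compares the field to the caller's user identity; chaining identityClaim("custom:tenantId") makes it compare against the JWT's tenant claim instead. Cross-tenant reads are then rejected at the AppSync layer, not because the application code prevents them, but because the generated resolver does. Two things to verify: the claim has to be present in the token AppSync receives (Cognito puts custom attributes in the ID token by default, not the access token), and because the rule allows update, tenantId itself should be protected with field-level authorization so a caller cannot reassign a row to another tenant.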

Data isolation

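For most workloads, AppSync's authorization rules are sufficient. For the generated CRUD operations, the resolver enforces the tenant filter, and there is no application-level path through those operations that can return another tenant's row. Custom Lambda resolvers are the exception: they have to enforce the tenant boundary themselves (see Custom Lambda paths).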

For PHI specifically, we add belt-and-suspenders:

Per-tenant DynamoDB partition keys. The PK is TENANT#<id>#PATIENT#<mrn>. Even if authorization were misconfigured, a Query (as opposed to a Scan) has to supply the tenant's ID to find the row.
S3 prefix isolation. Files are stored at s3://td-app/<tenant_id>/<rest>. IAM policies restrict each Lambda role to its tenant's prefix where possible. Cross-tenant Lambdas (admin, audit) have explicit, logged exceptions.
KMS keys per tenant. For high-value tenants, the KMS key is per-tenant. Encryption at rest is enforced by the key, not just the storage configuration.

The cost: more KMS keys, more S3 prefixes, slightly more complex IAM. The benefit: a misconfigured authorization rule alone does not produce a tenant breach. The key, prefix, and partition boundaries have to fail as well.
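
The key and prefix layouts above only help if every caller builds them the same way. A minimal sketch of shared builders, assuming tenant IDs and MRNs are restricted to a simple character set (that restriction, and the names patientPk, tenantObjectKey, and tenantPrefixPolicy, are assumptions for illustration):

```typescript
// Tenant IDs and MRNs are embedded in keys, so reject separator characters
// that would let one value spill into another tenant's key space.
// The allowed character set is an assumption; adjust to your ID format.
const SAFE_ID = /^[A-Za-z0-9_-]+$/

function assertSafeId(label: string, value: string): string {
  if (!SAFE_ID.test(value)) throw new Error(`invalid ${label}`)
  return value
}

// DynamoDB partition key in the TENANT#<id>#PATIENT#<mrn> layout.
function patientPk(tenantId: string, mrn: string): string {
  return `TENANT#${assertSafeId("tenantId", tenantId)}#PATIENT#${assertSafeId("mrn", mrn)}`
}

// S3 object key under the tenant's prefix. Rejects empty, "." and ".."
// segments (which also covers leading and trailing slashes).
function tenantObjectKey(tenantId: string, rest: string): string {
  const segments = rest.split("/")
  if (segments.some((s) => s === "" || s === "." || s === "..")) {
    throw new Error("invalid object path")
  }
  return `${assertSafeId("tenantId", tenantId)}/${rest}`
}

// IAM policy document limiting a role to one tenant's prefix in a bucket.
function tenantPrefixPolicy(bucket: string, tenantId: string) {
  const id = assertSafeId("tenantId", tenantId)
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Action: ["s3:GetObject", "s3:PutObject"],
        Resource: `arn:aws:s3:::${bucket}/${id}/*`,
      },
    ],
  }
}
```

Validating the ID inside the builder, rather than at each call site, means a malformed tenant ID fails before it reaches DynamoDB, S3, or IAM.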

Custom Lambda paths

Amplify Gen 2 makes Lambda functions a first-class object in the schema. For business logic that goes beyond CRUD, you write a function and bind it to a query or mutation.

For multi-tenant work, the patterns:

Read the tenant ID from the JWT at the top of the function. Never trust a tenant ID in the request body.
Pass the tenant ID through to every downstream call — DynamoDB queries, S3 operations, Bedrock invocations, third-party APIs.
Log the tenant ID with every log line. CloudWatch Logs Insights queries that filter by tenant become trivial.

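import type { AppSyncIdentityCognito } from "aws-lambda"
import type { Schema } from "../../data/resource" // path depends on your project layout
import { getPatient } from "./patients" // hypothetical tenant-scoped data helper

export const handler: Schema["createPriorAuthRequest"]["functionHandler"] = async (
  event
) => {
  // The tenant ID comes from the verified JWT claims, never from event.arguments.
  const identity = event.identity as AppSyncIdentityCognito | undefined
  const tenantId = identity?.claims?.["custom:tenantId"]
  if (typeof tenantId !== "string" || !tenantId) throw new Error("missing tenant claim")

  // Every downstream call takes the tenant ID explicitly.
  const patient = await getPatient(tenantId, event.arguments.mrn)
  if (!patient) throw new Error("not found")

  // ...
}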
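
The first and third patterns can be pulled into helpers that every handler shares. A minimal sketch with no AWS dependencies; the claim name matches the handler above, while requireTenantId, tenantLogger, and the injectable write function are assumptions made for illustration:

```typescript
// Minimal shape of the identity we rely on: AppSync's Cognito identity
// carries the decoded JWT claims under `claims`.
type IdentityLike = { claims?: Record<string, unknown> } | null | undefined

// Claim name is an assumption; it must match the custom attribute you defined.
const TENANT_CLAIM = "custom:tenantId"

// Fail closed: a missing or non-string claim is an error, never a default.
function requireTenantId(identity: IdentityLike): string {
  const value = identity?.claims?.[TENANT_CLAIM]
  if (typeof value !== "string" || value.length === 0) {
    throw new Error("missing tenant claim")
  }
  return value
}

type LogFields = Record<string, unknown>

// Returns a logger that stamps tenantId and requestId on every JSON line.
// `write` is injectable so the logger can be exercised without stdout.
function tenantLogger(
  tenantId: string,
  requestId: string,
  write: (line: string) => void = console.log
) {
  return (level: "info" | "warn" | "error", message: string, fields: LogFields = {}) => {
    // Caller fields are spread first so they cannot overwrite the tenant context.
    write(JSON.stringify({ ...fields, level, message, tenantId, requestId }))
  }
}
```

Because each line is a JSON object with a top-level tenantId, a CloudWatch Logs Insights filter on that field is enough to isolate one tenant's activity. Keep PHI such as MRNs out of the extra fields.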

RAG and AI workloads

When the SaaS includes an AI feature — RAG over a tenant's documents, an agent that operates on tenant data — tenancy has to extend to the AI layer.

The pattern we use on Amplify Gen 2:

Vector index per tenant. Each tenant gets its own Bedrock Knowledge Base, OpenSearch index, or pgvector table. Retrieval cannot cross tenants because no index holds more than one tenant's documents.
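Tenant ID attached to every Bedrock call. Sent as request metadata where the API supports it (for example, requestMetadata on the Converse API), so it appears in Bedrock model invocation logs alongside the CloudTrail record of the call.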
Audit log scoped to tenant. Every AI interaction in the audit log carries the tenant ID. Tenant-scoped queries to the audit log are first-class.
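
The first and third items can be sketched without any AWS calls. This assumes a hypothetical per-tenant registry populated by onboarding; the knowledge base IDs, knowledgeBaseFor, and auditEntry are illustrative names, not Bedrock or Amplify APIs:

```typescript
// Hypothetical per-tenant registry, populated by onboarding automation.
type TenantAiConfig = { knowledgeBaseId: string }

const aiRegistry: ReadonlyMap<string, TenantAiConfig> = new Map([
  ["t1", { knowledgeBaseId: "KB_T1_EXAMPLE" }],
  ["t2", { knowledgeBaseId: "KB_T2_EXAMPLE" }],
])

// Resolve the tenant's own index. There is deliberately no shared fallback:
// an unknown tenant is an error, not a query against a default index.
function knowledgeBaseFor(tenantId: string): string {
  const config = aiRegistry.get(tenantId)
  if (!config) throw new Error(`no vector index provisioned for tenant ${tenantId}`)
  return config.knowledgeBaseId
}

type AuditEntry = {
  tenantId: string
  userId: string
  requestId: string
  knowledgeBaseId: string
  action: "retrieve" | "generate"
  timestamp: string
}

// Build the audit record for one AI interaction. The tenant ID comes from
// the caller's verified context, and the index is derived from it.
function auditEntry(
  ctx: { tenantId: string; userId: string; requestId: string },
  action: AuditEntry["action"],
  now: Date = new Date()
): AuditEntry {
  return {
    tenantId: ctx.tenantId,
    userId: ctx.userId,
    requestId: ctx.requestId,
    knowledgeBaseId: knowledgeBaseFor(ctx.tenantId),
    action,
    timestamp: now.toISOString(),
  }
}
```

Deriving the index from the tenant ID, rather than accepting an index ID as an argument, removes the code path where a caller could name another tenant's index.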

Amplify's defineFunction and defineStorage make per-tenant resources straightforward to provision. The catch is the operations: more resources to monitor, more cost lines, more lifecycle to manage. We invest in automation here from day one — no manual provisioning of per-tenant resources, ever.

Audit and compliance

For HIPAA SaaS on Amplify Gen 2:

Enable CloudTrail at the management-event level, with data events for the S3 buckets and DynamoDB tables that hold PHI.
Enable AWS Config rules for the standard set: encryption at rest, public access blocked, MFA required.
Use Security Hub or a third-party CSPM for ongoing posture monitoring.
Send application logs to CloudWatch with structured logging (JSON, with tenant ID, user ID, request ID).
Set up an audit log pipeline as described in Audit Logging for AI Agents for any AI interactions.

Amplify Gen 2 itself is HIPAA-eligible under the AWS BAA. The eligibility is on the underlying services (AppSync, Lambda, DynamoDB, S3, Cognito), not the framework branding. Verify each service in your data path is on the BAA list.

Operations and rollout

Per-tenant resources require automation. The patterns we use:

Tenant onboarding as a Step Functions workflow. Provisions per-tenant DynamoDB partition, S3 prefix, KMS key, vector index, audit log path. Idempotent. Logged.
Tenant offboarding as a Step Functions workflow. Inverse of onboarding, with a hold period before destructive operations to allow contract dispute resolution.
Per-environment promotion. Separate dev / staging / prod stacks, with the same tenancy model. Tenant data does not move between environments.
Per-tenant deploys are not a thing. Application code is the same for all tenants; configuration is per-tenant.
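
The idempotency and hold-period requirements can be illustrated with an in-memory model. This is a sketch of the logic only, not a Step Functions definition; the resource names mirror the onboarding list, and the in-memory state, function names, and 30-day default hold are assumptions:

```typescript
// Resource kinds taken from the onboarding list; names are illustrative.
type Resource = "kmsKey" | "dynamoPartition" | "s3Prefix" | "vectorIndex" | "auditLogPath"

const ONBOARDING_STEPS: readonly Resource[] = [
  "kmsKey", "dynamoPartition", "s3Prefix", "vectorIndex", "auditLogPath",
]

// In-memory stand-in for real provisioning state (e.g. a tenants table).
type TenantState = { provisioned: Set<Resource>; offboardedAt?: Date }
const tenants = new Map<string, TenantState>()

// Idempotent: each step checks state before acting, so a retried or
// re-run execution only provisions what is still missing.
function onboardTenant(
  tenantId: string,
  log: (line: string) => void = console.log
): Resource[] {
  const state = tenants.get(tenantId) ?? { provisioned: new Set<Resource>() }
  tenants.set(tenantId, state)
  const created: Resource[] = []
  for (const step of ONBOARDING_STEPS) {
    if (state.provisioned.has(step)) continue
    // A real step would call the relevant AWS API here.
    state.provisioned.add(step)
    created.push(step)
    log(JSON.stringify({ event: "provisioned", tenantId, resource: step }))
  }
  return created
}

// Offboarding records the timestamp once; repeated calls do not reset it.
function markOffboarded(tenantId: string, at: Date): void {
  const state = tenants.get(tenantId)
  if (!state) throw new Error("unknown tenant")
  if (!state.offboardedAt) state.offboardedAt = at
}

// Destruction is allowed only after the hold period has fully elapsed.
function canDestroy(tenantId: string, now: Date, holdDays = 30): boolean {
  const offboardedAt = tenants.get(tenantId)?.offboardedAt
  if (!offboardedAt) return false
  return now.getTime() - offboardedAt.getTime() >= holdDays * 24 * 60 * 60 * 1000
}
```

Running onboardTenant twice for the same tenant returns an empty list the second time, which is the property a retried workflow execution relies on.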

When Amplify Gen 2 stops being the right tool

For a small number of high-value tenants whose security posture demands hard isolation — separate AWS accounts, separate VPCs, separate everything — Amplify Gen 2 is not the natural fit. AWS Organizations + a CDK stack per tenant is more work but produces a stronger isolation story.

The line we draw: bridge model on Amplify Gen 2 for the long tail of tenants. Silo model on a CDK stack for the handful of tenants whose contracts require it. The same application code runs in both.

Where to start

If you are at the beginning of a multi-tenant SaaS project, the design choices to make first:

Pool, bridge, or silo? Tied to your customer expectations and your compliance posture, not to your technology stack.
One Cognito pool or many? Affects user management, SSO, and admin tooling.
What is your audit log strategy? Has to be solid before you have customers.
What is your tenant onboarding automation? Has to be solid before you have customers.

The technology — Amplify Gen 2 vs. raw CDK vs. something else — is downstream of these decisions. We have shipped both Amplify Gen 2 and CDK-native architectures for similar problems. The framework matters less than the discipline.