DynamoDB Access Patterns for High-Performance Applications
A practical guide to DynamoDB data modeling — covering single-table design, access pattern planning, GSIs, sparse indexes, and the patterns that prevent expensive rework.
DynamoDB is one of the most performant and scalable databases available on AWS. It is also one of the most expensive to retrofit when the data model is wrong. The reason is the same in both cases: DynamoDB is built around access patterns. You define the access patterns first, model the data to support them efficiently, and then query exactly as the model expects. Do this correctly and you get single-digit millisecond latency at any scale. Design it in reverse (start with the data model and figure out access patterns later) and you eventually hit a wall that requires either rebuilding the data model or replacing DynamoDB with a relational database.
This guide covers the patterns that matter, starting with the cardinal rule.
The Cardinal Rule: Access Patterns Before Data Model
In a relational database, you design a normalized schema and then write queries. The query layer is flexible — you can join tables, filter on any column, sort by arbitrary fields, and add indexes later if queries are slow.
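In DynamoDB, the query layer is not flexible. A query must specify a partition key and can optionally narrow the result with a condition on the sort key. Secondary indexes expand what you can query, but each one adds storage and write cost: local secondary indexes (LSIs) must be defined at table creation time, while global secondary indexes (GSIs) can be added later at the price of backfilling the index. Ad-hoc queries that were not anticipated in the data model either require expensive scans or are impossible.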
This means the design process is inverted. Before modeling any data, document every access pattern your application will use:
1. Get user by user_id
2. Get all orders for a user, sorted by date (descending)
3. Get all orders with status=PENDING across all users (admin use)
4. Get order by order_id
5. Get all items in an order
6. Get all orders containing a specific product_id
Write these down. All of them. Then design the data model to support every pattern with a primary key query or a GSI query. If a pattern cannot be supported this way, it either needs to be rethought or moved to a different data store.
Primary Key Design
A DynamoDB primary key is either a simple key (partition key only) or a composite key (partition key + sort key). Most production tables use composite keys.
Partition key (PK) — Determines which physical partition holds the item. DynamoDB distributes items across partitions based on the partition key hash. All queries must specify the partition key.
Sort key (SK) — Enables range queries within a partition. Items with the same partition key and different sort keys are stored together, sorted lexicographically. Sort key queries support begins_with, between, and comparison operators.
The most important property of a good partition key: high cardinality with even distribution. A partition key with low cardinality (e.g., a status field with three possible values) concentrates traffic on a small number of partitions, creating hot partitions that hit throughput limits. A partition key with high cardinality (e.g., user_id or order_id) distributes traffic evenly.
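The effect of cardinality can be illustrated with a small simulation. This is only a sketch: DynamoDB's internal hash function and partition count are not public, so SHA-256 and `NUM_PARTITIONS = 8` are stand-ins chosen for illustration.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical partition count, for illustration only


def partition_for(partition_key: str) -> int:
    """Map a key to a partition using SHA-256 (a stand-in for DynamoDB's internal hash)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def partitions_used(keys: list[str]) -> Counter:
    """Count how many requests land on each partition."""
    return Counter(partition_for(key) for key in keys)


# 3,000 requests keyed by a three-value status field versus by user_id.
status_keys = [f"STATUS#{s}" for s in ("PENDING", "SHIPPED", "COMPLETED")] * 1000
user_keys = [f"USER#user_{i}" for i in range(3000)]

by_status = partitions_used(status_keys)
by_user = partitions_used(user_keys)

print(len(by_status), "partition(s) absorb all status-keyed traffic")
print(len(by_user), "partition(s) share the user-keyed traffic")
```

However the hash behaves, three distinct key values can never reach more than three partitions, which is the hot-partition problem in miniature.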
Single-Table Design
Single-table design is the dominant pattern in production DynamoDB systems. The idea: store all entity types in a single table, using the primary key structure to differentiate entities and support multiple access patterns.
This is counterintuitive to engineers with a relational background, where entities live in their own tables. The reason single-table works in DynamoDB is that DynamoDB queries are partition-scoped — items with the same partition key are stored and retrieved together efficiently. Storing related data under a single partition key, using the sort key to differentiate it, enables fetching an entity and its related data in a single query.
A Concrete Example
Consider an order management system with Users, Orders, and Order Items.
Table: OrdersTable
# User entity
PK: USER#user_123 SK: #METADATA#user_123
Attributes: name, email, created_at
# Order entity (under the user)
PK: USER#user_123 SK: ORDER#2026-02-17#order_456
Attributes: total, status, shipping_address
# Order item entity (under the order)
PK: ORDER#order_456 SK: ITEM#item_789
Attributes: product_id, quantity, unit_price
# Order entity (accessible by order_id directly)
PK: ORDER#order_456 SK: #METADATA#order_456
Attributes: user_id, total, status, created_at
This structure supports:
- Get user: PK = USER#user_123, SK = #METADATA#user_123
- Get all orders for user: PK = USER#user_123, SK begins_with ORDER#
- Get orders for user in date range: PK = USER#user_123, SK between ORDER#2026-01-01 and ORDER#2026-03-01 (the upper bound is the day after the range ends, because ORDER#2026-02-28#order_456 sorts after the bare prefix ORDER#2026-02-28)
- Get order by ID: PK = ORDER#order_456, SK = #METADATA#order_456
- Get all items in order: PK = ORDER#order_456, SK begins_with ITEM#
This is the core value of single-table design: multiple access patterns served by a single table with no joins.
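As a rough sketch, the key layout above might be wrapped in small helper functions that build query parameters. The helper names are hypothetical; the dictionaries use the low-level attribute-value format and could be passed to boto3's `client.query(**params)`. The date-range helper uses the day after the range as its upper bound, since a full sort key such as `ORDER#2026-02-28#order_456` sorts after the bare prefix `ORDER#2026-02-28`.

```python
from datetime import date, timedelta

TABLE_NAME = "OrdersTable"


def user_pk(user_id: str) -> str:
    return f"USER#{user_id}"


def order_sk(order_date: str, order_id: str) -> str:
    # ISO dates (YYYY-MM-DD) sort as strings in the same order as calendar dates.
    return f"ORDER#{order_date}#{order_id}"


def orders_for_user(user_id: str, newest_first: bool = True) -> dict:
    """Query parameters for 'get all orders for a user, sorted by date'."""
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": user_pk(user_id)},
            ":prefix": {"S": "ORDER#"},
        },
        # ScanIndexForward=False returns items in descending sort key order.
        "ScanIndexForward": not newest_first,
    }


def orders_in_range(user_id: str, first_day: str, last_day: str) -> dict:
    """Query parameters for orders placed from first_day through last_day inclusive."""
    day_after = date.fromisoformat(last_day) + timedelta(days=1)
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "PK = :pk AND SK BETWEEN :lo AND :hi",
        "ExpressionAttributeValues": {
            ":pk": {"S": user_pk(user_id)},
            ":lo": {"S": f"ORDER#{first_day}"},
            ":hi": {"S": f"ORDER#{day_after.isoformat()}"},
        },
    }
```

Centralizing key construction in one place keeps the string formats from drifting between the write path and the read path.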
Item Collections
Items that share a partition key form an item collection. In a single-table design, a user's item collection might contain their profile, their orders, and their addresses — all stored under PK = USER#user_id.
Item collection size is limited to 10GB if a local secondary index exists on the table. For most applications, this limit is not reached, but if individual item collections can grow large (e.g., a user with millions of orders), design with this limit in mind.
Global Secondary Indexes
A Global Secondary Index (GSI) is a separate index with its own partition key and sort key, built from a subset of the table's attributes. GSIs enable access patterns that the base table's primary key does not support.
GSIs support only eventually consistent reads (you can request strongly consistent reads from the base table, but not from a GSI). They add storage cost and write throughput cost: every write to the base table that affects a GSI attribute triggers a write to the GSI.
GSI Overloading
A single GSI can support multiple access patterns if you use the same overloading pattern as the base table. This is GSI overloading.
Example: You need to support two additional access patterns:
- Get all PENDING orders (across all users) — admin view
- Look up a user by email address
Rather than creating two GSIs, create one with GSI_PK and GSI_SK attributes:
# For the order entity, populate GSI attributes for status-based lookup
GSI_PK: STATUS#PENDING GSI_SK: ORDER#2026-02-17#order_456
# For the user entity, populate GSI attributes for email lookup
GSI_PK: EMAIL#user@example.com GSI_SK: #METADATA#user_123
Now a single GSI supports both patterns. The overloaded GSI pattern keeps the number of indexes minimal while supporting a wide range of access patterns.
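One way to keep the overloaded attributes consistent is to derive them in a single function before every write. The sketch below works on plain Python dictionaries; the `entity_type` attribute and the lower-casing of email addresses are assumptions made for this example, not part of the model above.

```python
def with_gsi_keys(item: dict) -> dict:
    """Return a copy of the item with overloaded GSI_PK / GSI_SK attributes added."""
    enriched = dict(item)
    entity = item.get("entity_type")  # hypothetical discriminator attribute
    if entity == "ORDER":
        # Status-based lookup: all orders with the same status share a GSI partition.
        order_date = item["created_at"][:10]  # YYYY-MM-DD part of an ISO timestamp
        enriched["GSI_PK"] = f"STATUS#{item['status']}"
        enriched["GSI_SK"] = f"ORDER#{order_date}#{item['order_id']}"
    elif entity == "USER":
        # Email lookup: normalizing case here is an assumption of this sketch.
        enriched["GSI_PK"] = f"EMAIL#{item['email'].lower()}"
        enriched["GSI_SK"] = f"#METADATA#{item['user_id']}"
    # Any other entity type gets no GSI attributes and stays out of the index.
    return enriched


order = with_gsi_keys({
    "entity_type": "ORDER",
    "order_id": "order_456",
    "status": "PENDING",
    "created_at": "2026-02-17T14:23:11Z",
})
user = with_gsi_keys({
    "entity_type": "USER",
    "user_id": "user_123",
    "email": "User@Example.com",
})
```

Items of other entity types pass through unchanged, so they never appear in the index.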
Sparse Indexes
A sparse index is a GSI that only indexes a subset of items — specifically, only items that have the GSI's partition key attribute defined.
If the GSI_PK attribute is only populated on items with a specific status — say, PENDING orders — then the GSI only contains those items. Queries against the sparse index automatically filter to that subset without needing a filter expression.
# Only PENDING orders have this attribute set
PENDING_ORDER_GSI_PK: "PENDING" # Only set on orders with status=PENDING
# COMPLETED orders do not have this attribute at all
# → They are not in the GSI
Querying the GSI for PK = PENDING returns only pending orders, efficiently, without scanning completed orders. When an order is fulfilled and its status changes to COMPLETED, the attribute is removed, and the item is automatically removed from the GSI.
Sparse indexes are useful for any pattern that involves "get all X where Y is true" where Y is a state that applies to a minority of items.
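Removing the attribute when the status changes might look like the following sketch, which builds parameters for boto3's `client.update_item(**params)`. The function name is hypothetical, and the condition that the order is still PENDING is an added safeguard, not something the pattern requires.

```python
def complete_order_update(order_id: str) -> dict:
    """Parameters that mark an order COMPLETED and drop it from the sparse index."""
    return {
        "TableName": "OrdersTable",
        "Key": {
            "PK": {"S": f"ORDER#{order_id}"},
            "SK": {"S": f"#METADATA#{order_id}"},
        },
        # SET changes the status; REMOVE deletes the index attribute, which
        # takes the item out of the GSI.
        "UpdateExpression": "SET #status = :completed REMOVE PENDING_ORDER_GSI_PK",
        # Assumed safeguard: only complete orders that are still pending.
        "ConditionExpression": "#status = :pending",
        # "status" is a DynamoDB reserved word, so it needs a name placeholder.
        "ExpressionAttributeNames": {"#status": "status"},
        "ExpressionAttributeValues": {
            ":completed": {"S": "COMPLETED"},
            ":pending": {"S": "PENDING"},
        },
    }


params = complete_order_update("order_456")
```

Doing the status change and the attribute removal in one update keeps the index from ever listing a completed order.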
Relationship Patterns
1:1 Relationships
Store as separate items sharing a partition key, or as attributes on a single item if the data is always accessed together and the total item size remains under 400KB.
1:N Relationships
Use the parent entity's ID as the partition key and the child entity's ID (or a sortable attribute) as the sort key. This supports fetching all children of a parent in a single query.
PK: ACCOUNT#account_123
SK: TRANSACTION#2026-02-17T14:23:11Z#txn_456
For large collections, where the parent may have millions of children, consider whether you actually need to fetch all children or only recent children. The sort key's range query capability is particularly useful here — SK begins_with TRANSACTION#2026-02 fetches only February transactions, for example.
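The month-prefix query relies on timestamps that sort as strings in chronological order. A minimal sketch, assuming UTC ISO 8601 timestamps and a hypothetical table name `AccountsTable`:

```python
from datetime import datetime, timezone


def transaction_sk(occurred_at: datetime, txn_id: str) -> str:
    """Sort key with a UTC ISO 8601 timestamp, so string order equals time order."""
    stamp = occurred_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"TRANSACTION#{stamp}#{txn_id}"


def transactions_in_month(account_id: str, year: int, month: int) -> dict:
    """Query parameters for one calendar month of an account's transactions."""
    return {
        "TableName": "AccountsTable",  # hypothetical table name
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :month)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"ACCOUNT#{account_id}"},
            ":month": {"S": f"TRANSACTION#{year:04d}-{month:02d}"},
        },
    }


sk = transaction_sk(datetime(2026, 2, 17, 14, 23, 11, tzinfo=timezone.utc), "txn_456")
february = transactions_in_month("account_123", 2026, 2)
```

Zero-padded, fixed-width timestamps are what make the lexicographic sort behave like a chronological one.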
M:N Relationships
Many-to-many relationships require explicit join items. If users can belong to many teams, and teams have many users:
# User → Teams lookup
PK: USER#user_123 SK: TEAM#team_456
Attributes: role, joined_at
# Team → Users lookup (a second item with PK and SK swapped; a GSI keyed on SK then PK is an alternative)
PK: TEAM#team_456 SK: USER#user_123
Attributes: role, joined_at
This pattern duplicates the relationship data, which is the DynamoDB approach to supporting queries in both directions without joins.
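A small sketch of building both directions of a membership from one call, so the two copies cannot be constructed with different attributes. The function name is hypothetical and the items are plain Python dictionaries.

```python
def membership_items(user_id: str, team_id: str, role: str, joined_at: str) -> list[dict]:
    """Build the two items that record one user-team membership."""
    shared = {"role": role, "joined_at": joined_at}
    return [
        # User -> Teams lookup
        {"PK": f"USER#{user_id}", "SK": f"TEAM#{team_id}", **shared},
        # Team -> Users lookup (same data, keys swapped)
        {"PK": f"TEAM#{team_id}", "SK": f"USER#{user_id}", **shared},
    ]


items = membership_items("user_123", "team_456", "admin", "2026-02-17T14:23:11Z")
```

Because the relationship lives in two items, writing both in one transaction is a reasonable way to keep them from drifting apart.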
Transactional Writes
DynamoDB supports ACID transactions across up to 100 items in a single TransactWriteItems call. This enables:
- Creating multiple related items atomically (e.g., creating an order and decrementing inventory in a single transaction)
- Conditional writes that fail if a precondition is not met (e.g., only create an item if an item with that key does not already exist)
- Consistent multi-item updates that should not be partially applied
Transactions in DynamoDB cost twice the write capacity of non-transactional writes (the overhead of the coordination mechanism). For operations that genuinely require atomicity, this is the right tool. For operations that do not, pay the lower cost of standard writes.
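The order-plus-inventory case could be expressed roughly as below. The parameters follow the shape boto3's `client.transact_write_items(**params)` expects; the `PRODUCT#` key layout and the `units_in_stock` attribute are assumptions made for this sketch.

```python
def place_order_transaction(order_id: str, product_id: str, quantity: int) -> dict:
    """Parameters that create an order and decrement stock atomically."""
    return {
        "TransactItems": [
            {
                "Put": {
                    "TableName": "OrdersTable",
                    "Item": {
                        "PK": {"S": f"ORDER#{order_id}"},
                        "SK": {"S": f"#METADATA#{order_id}"},
                        "status": {"S": "PENDING"},
                    },
                    # Fail if an order with this key already exists.
                    "ConditionExpression": "attribute_not_exists(PK)",
                }
            },
            {
                "Update": {
                    "TableName": "OrdersTable",
                    # Hypothetical product item holding the stock counter.
                    "Key": {
                        "PK": {"S": f"PRODUCT#{product_id}"},
                        "SK": {"S": f"#METADATA#{product_id}"},
                    },
                    "UpdateExpression": "SET units_in_stock = units_in_stock - :qty",
                    # Fail the whole transaction if stock would go negative.
                    "ConditionExpression": "units_in_stock >= :qty",
                    "ExpressionAttributeValues": {":qty": {"N": str(quantity)}},
                }
            },
        ]
    }


txn = place_order_transaction("order_456", "product_42", 2)
```

If either condition fails, neither write is applied, which is the property that justifies the doubled write cost.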
Optimistic Locking with Version Numbers
For concurrent update scenarios, DynamoDB's conditional write expressions enable optimistic locking without a separate locking mechanism:
# Write condition: only update if version_number matches expected value
ConditionExpression: "version_number = :expected_version"
UpdateExpression: "SET version_number = :new_version, ..."
If two processes try to update the same item concurrently, the second write fails the condition check. The second process then retries with the current state. This is the standard pattern for preventing lost updates in DynamoDB.
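Filled out, the fragment above might become the following parameter builder for boto3's `client.update_item(**params)`. The function name and the `total` attribute being updated are illustrative assumptions. When the condition fails, boto3 raises `ConditionalCheckFailedException`, which is the caller's signal to re-read the item and retry.

```python
def versioned_update(pk: str, sk: str, expected_version: int, new_total: str) -> dict:
    """Update parameters that only apply if the stored version is unchanged."""
    return {
        "TableName": "OrdersTable",
        "Key": {"PK": {"S": pk}, "SK": {"S": sk}},
        # Bump the version in the same write that changes the data.
        "UpdateExpression": "SET version_number = :new_version, #total = :total",
        # Reject the write if another process has already bumped the version.
        "ConditionExpression": "version_number = :expected_version",
        # Placeholder guards against the attribute name clashing with a reserved word.
        "ExpressionAttributeNames": {"#total": "total"},
        "ExpressionAttributeValues": {
            ":expected_version": {"N": str(expected_version)},
            ":new_version": {"N": str(expected_version + 1)},
            # Numbers travel as strings in the low-level attribute-value format.
            ":total": {"N": new_total},
        },
    }


update = versioned_update("ORDER#order_456", "#METADATA#order_456", 7, "59.98")
```

The version the caller passes in is the one it read before computing the change, so a stale read is detected at write time.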
DynamoDB Streams for Event-Driven Patterns
DynamoDB Streams captures a time-ordered sequence of item-level changes (inserts, updates, deletes) and makes them available for downstream processing. This enables event-driven architectures without polling.
Common patterns built on DynamoDB Streams:
Derived data maintenance. When an order item is updated, a stream processor recalculates the order total and updates the parent order item. The parent converges on its children shortly after each change, without requiring the write path to do both updates.
Cross-region replication. Stream processors read changes from a primary region and replicate them to secondary regions. (AWS Global Tables is a managed version of this pattern.)
Audit logging. Every item change is captured in the stream and written to an audit log store. This is a clean separation between the application write path and the audit trail — the application writes to DynamoDB, the stream processor writes to the audit log, and neither path knows about the other.
Search index synchronization. Item changes in DynamoDB trigger an OpenSearch index update via a stream processor. The operational database and the search index stay in sync without coupling the write path to the search index write.
Stream records are available for 24 hours. Consumers must process them within that window or miss them. For audit and compliance use cases, ensure your stream consumer has adequate error handling and retry logic.
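As an illustration of the audit-logging pattern, a Lambda-style consumer might flatten each stream record into an audit entry. This is a sketch: the audit store is left out, the `PK`/`SK` key names follow the table design above, and the old and new images are only present when the stream's view type includes them.

```python
def audit_entries(event: dict) -> list[dict]:
    """Convert a DynamoDB Streams event into a list of audit log entries."""
    entries = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        entries.append({
            "action": record["eventName"],  # INSERT, MODIFY, or REMOVE
            "pk": change["Keys"]["PK"]["S"],
            "sk": change["Keys"]["SK"]["S"],
            "sequence": change["SequenceNumber"],
            # Images depend on the stream view type, so they may be absent.
            "before": change.get("OldImage"),
            "after": change.get("NewImage"),
        })
    return entries


def handler(event, context):
    """Lambda entry point; writing to the audit store is omitted from this sketch."""
    entries = audit_entries(event)
    return {"processed": len(entries)}
```

Raising an exception from the handler, rather than swallowing it, is what lets the event source retry the batch within the retention window.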
TTL for Automatic Expiration
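DynamoDB's Time to Live (TTL) feature automatically deletes items after the timestamp in a designated attribute has passed. The attribute must hold a Unix epoch time in seconds, stored as a Number. TTL deletions are background operations: they do not consume write capacity, and they typically occur within a few days of the expiry time rather than exactly at it. Until the deletion happens, an expired item can still appear in reads, so filter on the timestamp if that matters.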
TTL is appropriate for:
- Session data — Session records that should expire after a fixed inactivity period
- Temporary state — Pending verifications, one-time tokens, in-progress operations with timeouts
- Caching — Items used as a DynamoDB cache layer, where stale data should be removed automatically
TTL deletions appear in DynamoDB Streams, which means TTL can be used to trigger downstream cleanup operations — deleting related items in other tables or updating derived data when the primary item expires.
For compliance use cases where data must be retained for a defined period and then deleted, TTL combined with a stream processor provides a clean deletion mechanism with a downstream audit log of the deletion event.
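A minimal sketch of the session case, assuming a hypothetical `expires_at` attribute configured as the table's TTL attribute and a 30-minute lifetime. Because deletion is not immediate, the reader checks the timestamp itself; the last helper recognizes TTL deletions in a stream record by the service identity DynamoDB attaches to them.

```python
import time
from typing import Optional

SESSION_TTL_SECONDS = 30 * 60  # hypothetical session lifetime


def session_item(session_id: str, now: Optional[int] = None) -> dict:
    """Build a session item whose TTL attribute is a Unix epoch time in seconds."""
    now = int(time.time()) if now is None else now
    return {
        "PK": f"SESSION#{session_id}",
        "SK": "#METADATA",
        "expires_at": now + SESSION_TTL_SECONDS,  # assumed TTL attribute name
    }


def is_live(item: dict, now: Optional[int] = None) -> bool:
    """Treat items past their expiry as gone, even if not yet deleted."""
    now = int(time.time()) if now is None else now
    return item["expires_at"] > now


def is_ttl_delete(record: dict) -> bool:
    """True when a stream record is a deletion performed by the TTL process."""
    identity = record.get("userIdentity", {})
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == "dynamodb.amazonaws.com"
    )
```

Distinguishing TTL deletions from application deletes lets the downstream cleanup run only for expirations.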
Capacity Planning: On-Demand vs. Provisioned
DynamoDB offers two billing modes:
On-demand: You pay per request. DynamoDB scales capacity automatically with no configuration and no capacity planning, although a sudden spike far above the table's previous peak can still be throttled briefly. Higher per-request cost than provisioned at sustained load.
Provisioned: You specify the read and write capacity units (RCUs and WCUs) the table should maintain. Lower per-unit cost than on-demand at predictable load. Requires capacity planning and auto-scaling configuration to handle traffic spikes.
For most applications, on-demand mode is the right default:
- No risk of throttling from under-provisioning
- No wasted spend from over-provisioning
- Zero capacity planning required
- Appropriate for variable or unpredictable traffic
Switch to provisioned mode when: you have a well-characterized, stable traffic pattern and the per-request cost difference justifies the operational overhead of capacity management. For high-throughput applications with millions of requests per day, the cost difference is significant.
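Whether the difference justifies switching is simple arithmetic once you have your region's current prices. The sketch below takes every price as a parameter because prices vary by region and change over time; the figures in the example call are placeholders, not real AWS prices.

```python
def monthly_on_demand_cost(
    reads: float,
    writes: float,
    price_per_million_reads: float,
    price_per_million_writes: float,
) -> float:
    """Cost of a month of on-demand traffic, given request counts and unit prices."""
    return (
        reads / 1_000_000 * price_per_million_reads
        + writes / 1_000_000 * price_per_million_writes
    )


def monthly_provisioned_cost(
    rcus: float,
    wcus: float,
    price_per_rcu_hour: float,
    price_per_wcu_hour: float,
    hours: float = 730,  # approximate hours in a month
) -> float:
    """Cost of holding fixed provisioned capacity for a month."""
    return hours * (rcus * price_per_rcu_hour + wcus * price_per_wcu_hour)


# Placeholder prices for illustration only; look up current prices for your region.
on_demand = monthly_on_demand_cost(300_000_000, 60_000_000, 0.25, 1.25)
provisioned = monthly_provisioned_cost(200, 50, 0.0001, 0.0007)
cheaper = "provisioned" if provisioned < on_demand else "on-demand"
```

The provisioned figure only holds if the provisioned capacity actually covers peak traffic, so the comparison should use capacity sized for the peak, not the average.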
Cost Optimization Patterns
Keep items small. DynamoDB bills on the size of the items read and written, so large items cost more per operation. A projection expression trims what is returned but not what is billed, so the saving has to come from the stored item: keep large blobs in S3 and store only the S3 reference in DynamoDB.
Use batch operations. BatchGetItem and BatchWriteItem reduce per-request overhead for bulk operations. TransactWriteItems is more expensive than batch writes for non-transactional operations — use it only when atomicity is actually required.
Prefer eventual consistency where possible. Eventually consistent reads cost half the RCUs of strongly consistent reads. For read patterns that do not require seeing the most recent write (e.g., displaying a list that updates periodically), eventual consistency is appropriate and less expensive.
Archive infrequently accessed data. Items that are rarely accessed but must be retained (e.g., historical order records, archived documents) can be moved to S3 and accessed via Athena, reducing DynamoDB storage costs.
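The capacity arithmetic behind the item-size, transaction, and consistency points above can be sketched as follows. Reads are metered in 4 KB steps and writes in 1 KB steps; the function names are hypothetical.

```python
import math


def read_units(item_size_bytes: int, consistency: str = "eventual") -> float:
    """RCUs consumed by reading one item of the given size."""
    blocks = max(1, math.ceil(item_size_bytes / 4096))  # reads round up to 4 KB
    multiplier = {"eventual": 0.5, "strong": 1.0, "transactional": 2.0}[consistency]
    return blocks * multiplier


def write_units(item_size_bytes: int, transactional: bool = False) -> int:
    """WCUs consumed by writing one item of the given size."""
    blocks = max(1, math.ceil(item_size_bytes / 1024))  # writes round up to 1 KB
    return blocks * (2 if transactional else 1)


# A 6,000-byte item spans two 4 KB read blocks and six 1 KB write blocks.
strong = read_units(6000, "strong")
eventual = read_units(6000, "eventual")
standard_write = write_units(6000)
transactional_write = write_units(6000, transactional=True)
```

Trimming that item below 4 KB would halve its read cost at either consistency level, which is why item size is the first lever to check.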
When NOT to Use DynamoDB
DynamoDB is the wrong choice when:
Your access patterns are not known upfront. If you need ad-hoc queries across arbitrary fields — analytical queries, exploratory data access, complex filtering — DynamoDB will frustrate you. Use a relational database (Aurora PostgreSQL) or a purpose-built analytics store (Redshift, Athena over S3).
You need complex transactions across many items. DynamoDB's 100-item transaction limit and lack of multi-table join capability make it unsuitable for systems with complex relational constraints — financial ledgers with multi-table consistency requirements, inventory systems with cascading updates across many entities.
Your data model is highly relational and frequently changing. Single-table DynamoDB models for complex domains with many entity types and many access patterns are difficult to design correctly and difficult to evolve. If the access patterns are genuinely unpredictable, the flexibility of a relational database is worth the scaling trade-off.
You need full-text search or fuzzy matching. DynamoDB supports exact key lookups and range queries. Full-text search, stemming, and fuzzy matching require OpenSearch or a purpose-built search service.
DynamoDB is excellent for: user and session data, time-series event data, leaderboards and rankings, operational data stores with well-defined access patterns, and high-throughput write workloads. It is not a general-purpose replacement for relational databases.
Building the Right Model Before You Build the System
The upfront investment in DynamoDB access pattern analysis pays for itself many times over. An hour spent writing out every access pattern before touching the data model prevents weeks of rework when the application is in production and a new access pattern requires restructuring the table.
If you are building an application on AWS and working through the data architecture — whether that is DynamoDB, Aurora, or a hybrid — an architecture review covers this territory. Our cloud architecture practice and AI development work both involve DynamoDB design as a regular part of system design engagements.
The goal is always the same: model the data correctly the first time, so the system does not need to be rebuilt when it scales.