DynamoDB is a highly resilient, low-latency NoSQL database service that replicates data across multiple Availability Zones (AZs). It can replace existing NoSQL databases such as MongoDB, Cassandra, or Oracle NoSQL.
Features
- Supports both key-value and document data models
- Resilient in a region.
- DynamoDB spreads data across at least 3 geographically distinct data centers (AZs).
- Consistent responsiveness
- Performance
- SSD Storage.
- Single-digit millisecond latency
- Virtually unlimited throughput and storage
- DynamoDB is a serverless, fully managed NoSQL database service designed for Online Transactional Processing (OLTP) workloads.
- stores data in JSON-like, name-value documents.
- Flexible Schema
- automatically scales horizontally
- supports event-driven programming
- Security
- Encryption at rest using the AWS-owned CMK by default.
Components
- TABLE
- A collection of items that share the same primary key schema: a Partition Key (PK) alone, or a PK plus a Sort Key (SK)
- The name is case-sensitive.
- ARN: arn:aws:dynamodb:{region}:{account}:table/{tableName}
- Example: arn:aws:dynamodb:us-east-1:925307459448:table/Music
- ITEM
- A collection of attributes (up to 400KB) inside the table
- ATTRIBUTE
- A key and value pair
- Attributes can be nested
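As a rough illustration of these components, the sketch below builds a table ARN from the pattern above and shows one item with a nested attribute. The helper name `table_arn` and the item's attribute names and values are assumptions made up for this example; the item uses the low-level attribute-value notation (S for string, N for number, M for map).

```python
def table_arn(region: str, account: str, table_name: str) -> str:
    """Fill in the pattern arn:aws:dynamodb:{region}:{account}:table/{tableName}."""
    return f"arn:aws:dynamodb:{region}:{account}:table/{table_name}"


# Hypothetical item: each attribute is a name mapped to a typed value.
# Numbers are sent as strings; "M" holds nested attributes.
item = {
    "Artist": {"S": "No One You Know"},
    "SongTitle": {"S": "Call Me Today"},
    "Details": {
        "M": {
            "Year": {"N": "2015"},
            "Label": {"S": "Example Records"},
        }
    },
}

arn = table_arn("us-east-1", "925307459448", "Music")
print(arn)
```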
Partition
- One partition holds up to 10 GB of data.
- It supports up to 3,000 RCUs (Read Capacity Units) and 1,000 WCUs (Write Capacity Units).
- A Partition Key is used for partition selection via an internal hash function.
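The idea of hash-based partition selection can be sketched as follows. DynamoDB's real hash function is internal and not exposed, so MD5, the function name `pick_partition`, and the fixed partition count are assumptions used only to show the principle.

```python
import hashlib


def pick_partition(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition number with a stand-in hash (MD5)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# Spread 1,000 hypothetical keys over 4 partitions and count items per partition.
counts = [0, 0, 0, 0]
for i in range(1000):
    counts[pick_partition(f"user-{i}", 4)] += 1

print(counts)
```

The same key always lands on the same partition, while many distinct keys tend to spread across all of them.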
A DynamoDB table's primary key is either a partition key (PK) alone or a partition key combined with a sort key (SK), so there are 2 types of primary key.
Primary Keys
- Partition Key
- Represents a unique attribute such as an ID or email address.
- Each partition key value identifies exactly one item.
- The partition key is used as an input to the internal hash function, which returns the partition (physical location) on which data is stored.
- Composite Key (Partition Key + Sort Key)
- Items in the table can have the same partition key, but they must have different sort keys.
- All items with the same partition key are stored together and then sorted according to the sort key.
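A composite key can be pictured as a dictionary of partitions, each holding items ordered by sort key. The in-memory model below is only a mental picture, not how DynamoDB stores data; `put_item`, `query`, and the sample keys are hypothetical names for this sketch.

```python
# partition key -> {sort key -> attributes}
table = {}


def put_item(pk: str, sk: str, attrs: dict) -> None:
    """Write an item; an existing item with the same PK and SK is replaced."""
    table.setdefault(pk, {})[sk] = attrs


def query(pk: str) -> list:
    """Return all items for one partition key, ordered by sort key."""
    partition = table.get(pk, {})
    return [(sk, partition[sk]) for sk in sorted(partition)]


put_item("Artist A", "Song C", {"Year": 2015})
put_item("Artist A", "Song A", {"Year": 2012})
put_item("Artist B", "Song Z", {"Year": 2020})
put_item("Artist A", "Song B", {"Year": 2013})

songs = [sk for sk, _ in query("Artist A")]
print(songs)
```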
How do you use the keys?
- The partition key determines the logical partitions in which a table’s data is stored and affects the underlying physical partitions.
- The provisioned I/O capacity of the table is divided evenly among these physical partitions.
- The throughput of a table depends on the partition key design as well. To improve performance, choose a partition key with many distinct values, so that requests are spread across the partition space.
Reading and Writing Data
DynamoDB writes and reads an item as a whole.
Basic Write Operations
- PutItem: write a new item with the specified primary key, or replace an existing item with the same key (last write wins)
- UpdateItem: change attributes of the item with the specified primary key
- BatchWriteItem: write a batch of items with the specified primary keys
- DeleteItem: remove the item with the specified primary key
Basic Read Operations
- GetItem: retrieve the item with the specified primary key
- BatchGetItem: retrieve items with the specified primary keys
- Scan: retrieve all items in the table
- Query: retrieve items matching the sort key expression for the specified partition key
Scan
- Most flexible option.
- You can apply a filter with any attribute.
- You can use a scan operation without a filter. This will get all items.
- When a filter is used, a scan operation reads all items and applies a filter. So, it consumes a lot of capacity units.
Query
- With Query, you retrieve only the data you want.
- Only the PK and SK can be used in the key condition.
- You specify one partition key value, and Query retrieves the items with that partition key.
- You can refine the query with a condition on the sort key.
- Query uses the primary key index, so it is very efficient.
- You can apply filters after querying items; items removed by the filter still consume read capacity.
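To make the difference between Query and Scan concrete, here is a minimal sketch that only builds the request parameters for the low-level `query` and `scan` calls. The table name `Orders` and the attributes `UserId`, `OrderId`, and `Price` are assumptions for illustration; with boto3 installed, the dictionaries could be passed as `client.query(**params)` or `client.scan(**params)`.

```python
def build_query(table_name, user_id, order_prefix=None, min_price=None):
    """Query: key condition on PK (and optionally SK), filter applied afterwards."""
    names = {"#pk": "UserId"}
    values = {":pk": {"S": user_id}}
    key_condition = "#pk = :pk"
    if order_prefix is not None:
        names["#sk"] = "OrderId"
        values[":sk"] = {"S": order_prefix}
        key_condition += " AND begins_with(#sk, :sk)"
    params = {
        "TableName": table_name,
        "KeyConditionExpression": key_condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }
    if min_price is not None:
        names["#price"] = "Price"
        values[":min"] = {"N": str(min_price)}
        params["FilterExpression"] = "#price >= :min"
    return params


def build_scan(table_name, min_price):
    """Scan: no key condition, so every item is read before the filter applies."""
    return {
        "TableName": table_name,
        "FilterExpression": "#price >= :min",
        "ExpressionAttributeNames": {"#price": "Price"},
        "ExpressionAttributeValues": {":min": {"N": str(min_price)}},
    }


query_params = build_query("Orders", "user-1", order_prefix="2024-", min_price=10)
scan_params = build_scan("Orders", 10)
print(query_params["KeyConditionExpression"])
```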
Eventually Consistent Reads (default)
- Data can be read from any node. Replicas usually become consistent within 1 second.
- The data received may not reflect a recent write.
- The best practice is to use eventually consistent read wherever possible.
Strongly Consistent Reads
- Data is retrieved from a leader node.
- The most up-to-date copy of the data is returned.
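The read consistency is chosen per request. The sketch below builds parameters for a `GetItem` call, where `ConsistentRead` switches between the two modes; the helper name and the `Music` key values are assumptions for this example.

```python
def build_get_item(table_name, key, strongly_consistent=False):
    """Parameters for GetItem; ConsistentRead=False means eventually consistent."""
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }


key = {"Artist": {"S": "No One You Know"}, "SongTitle": {"S": "Call Me Today"}}
eventual = build_get_item("Music", key)
strong = build_get_item("Music", key, strongly_consistent=True)
print(eventual["ConsistentRead"], strong["ConsistentRead"])
```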
DynamoDB Transactions
DynamoDB transactions allow you to insert, update, and delete multiple items as a single all-or-nothing operation.
- Provides ACID (Atomicity, Consistency, Isolation, and Durability)
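A transactional write groups several actions into one request. The following sketch builds the parameters for a `TransactWriteItems` call; the tables (`Orders`, `Inventory`, `Carts`) and attribute names are hypothetical, and the dictionary could be passed to `client.transact_write_items(**params)` with boto3.

```python
def build_checkout_transaction(user_id, order_id, product_id):
    """Create an order, decrement stock, and clear the cart as one unit."""
    return {
        "TransactItems": [
            {
                "Put": {
                    "TableName": "Orders",
                    "Item": {"UserId": {"S": user_id}, "OrderId": {"S": order_id}},
                    # Fail the whole transaction if the order already exists.
                    "ConditionExpression": "attribute_not_exists(OrderId)",
                }
            },
            {
                "Update": {
                    "TableName": "Inventory",
                    "Key": {"ProductId": {"S": product_id}},
                    "UpdateExpression": "SET #s = #s - :one",
                    # Fail the whole transaction if nothing is left in stock.
                    "ConditionExpression": "#s >= :one",
                    "ExpressionAttributeNames": {"#s": "Stock"},
                    "ExpressionAttributeValues": {":one": {"N": "1"}},
                }
            },
            {
                "Delete": {
                    "TableName": "Carts",
                    "Key": {"UserId": {"S": user_id}, "ProductId": {"S": product_id}},
                }
            },
        ]
    }


tx = build_checkout_transaction("user-1", "order-42", "product-7")
print([next(iter(action)) for action in tx["TransactItems"]])
```

If any condition fails, none of the three actions is applied.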
Capacity
DynamoDB reserves the necessary resources to handle your throughput requirements and divides the throughput evenly among partitions.
Read Capacity Units (RCU)
- One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size.
- 1 RCU = 4 KB read per second (strongly consistent)
- 1 RCU = 2 * 4 = 8 KB read per second (eventually consistent)
- A transactional read requires 2 RCUs per 4 KB of data.
- (example 1) Item size: 5 KB
- 5 / 4, rounded up => 2 strongly consistent read units
- Eventually consistent read => 1 RCU (2 / 2)
- Strongly consistent read => 2 RCUs
- (example 2) Item size: 9 KB
- 9 / 4, rounded up => 3 strongly consistent read units
- Eventually consistent read => 2 RCUs (3 / 2, rounded up)
- Strongly consistent read => 3 RCUs
- (example 3) Item size: 3 KB. You need 50 eventually consistent reads per second. How many RCUs do you need?
- 3 / 4, rounded up => 1 strongly consistent read unit per item
- 1 * 50 => 50 read units per second
- 50 / 2 => 25 RCUs
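The RCU arithmetic above can be captured in a small helper. This is a sketch of the rules as stated in this section (4 KB units, half cost for eventually consistent reads, double cost for transactional reads); the function name and the `mode` strings are assumptions.

```python
import math


def rcus_needed(item_size_kb, reads_per_second=1, mode="strong"):
    """Estimate RCUs: round the item up to 4 KB units, then adjust per read mode."""
    units = math.ceil(item_size_kb / 4) * reads_per_second
    if mode == "eventual":
        return math.ceil(units / 2)   # two eventually consistent reads per RCU
    if mode == "transactional":
        return units * 2              # 2 RCUs per 4 KB
    return units                      # strongly consistent


print(rcus_needed(5, mode="eventual"))      # example 1
print(rcus_needed(9, mode="eventual"))      # example 2
print(rcus_needed(3, 50, mode="eventual"))  # example 3
```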
Write Capacity Units (WCU)
- 1 WCU = 1 KB write per second
- A transactional write requires 2 WCUs per 1 KB of data (prepare + commit).
- (example 1) Item size: 5 KB
- Standard write => 5 WCUs
- Transactional write => 10 WCUs (5 * 2)
- (example 2) Item size: 600 bytes. You need to write 100 items per second. How many WCUs do you need?
- 0.6 / 1, rounded up => 1 write unit per item
- 1 * 100 = 100 write units per second
- 100 WCUs
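The same kind of helper works for writes. It is a sketch of the rules in this section (1 KB units, double cost for transactional writes); the function name is an assumption.

```python
import math


def wcus_needed(item_size_kb, writes_per_second=1, transactional=False):
    """Estimate WCUs: round the item up to 1 KB units, doubled for transactions."""
    units = math.ceil(item_size_kb) * writes_per_second
    return units * 2 if transactional else units


print(wcus_needed(5))                      # example 1, standard write
print(wcus_needed(5, transactional=True))  # example 1, transactional write
print(wcus_needed(0.6, 100))               # example 2
```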
Read/Write Capacity Modes
- Provisioned throughput (default)
- Specify expected read and write throughput requirements.
- Each table is configured with RCU and WCU.
- Cheaper per request than on-demand mode
- Subject to throttling if the table is under-provisioned.
- You can set up auto scaling in provisioned mode.
- Use Cases
- With predictable and consistent traffic
- On-demand
- The capacity automatically scales
- You are charged per request.
- Use Cases
- When the usage is not predictable. No minimum capacity.
- You can switch the mode only once every 24 hours.
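The capacity mode is part of the table definition. The sketch below builds `CreateTable` parameters for either mode; the `Music` key schema and the default capacity numbers are assumptions for illustration, and the result could be passed to `client.create_table(**params)` with boto3.

```python
def build_create_table(table_name, on_demand, rcu=5, wcu=5):
    """CreateTable parameters for provisioned or on-demand capacity."""
    params = {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "Artist", "AttributeType": "S"},
            {"AttributeName": "SongTitle", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "Artist", "KeyType": "HASH"},      # partition key
            {"AttributeName": "SongTitle", "KeyType": "RANGE"},  # sort key
        ],
    }
    if on_demand:
        params["BillingMode"] = "PAY_PER_REQUEST"
    else:
        params["BillingMode"] = "PROVISIONED"
        params["ProvisionedThroughput"] = {
            "ReadCapacityUnits": rcu,
            "WriteCapacityUnits": wcu,
        }
    return params


provisioned = build_create_table("Music", on_demand=False, rcu=25, wcu=100)
on_demand = build_create_table("Music", on_demand=True)
print(provisioned["BillingMode"], on_demand["BillingMode"])
```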
The following DynamoDB features are charged:
- Read and Write Capacity
- Storage of data
- Cross-region data transfer (no charge for a single region data transfer)
DynamoDB Streams
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.Tutorial.html
When enabled, DynamoDB Streams captures a time-ordered sequence of item-level modifications (insert, update, and delete) in a DynamoDB table and stores the information for up to 24 hours.
You can create a trigger system (event-driven architecture) with DynamoDB streams and Lambda functions.
DynamoDB Streams supports the following stream record views:
- KEYS_ONLY: only the key attributes of the modified item
- NEW_IMAGE: The entire item, as it appears after it was modified
- OLD_IMAGE: The entire item, as it appears before it was modified
- NEW_AND_OLD_IMAGES: Both the new and the old images of the item
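A Lambda function attached to a stream receives batches of change records. The handler below is a minimal sketch that summarizes each record; the sample event is hand-written for this example and trimmed to the fields the handler reads. Which images are present depends on the stream view type, so the handler treats them as optional.

```python
def handler(event, context=None):
    """Summarize each stream record: event type, keys, and available images."""
    summary = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        summary.append({
            "event": record["eventName"],    # INSERT, MODIFY, or REMOVE
            "keys": change["Keys"],
            "new": change.get("NewImage"),   # missing for KEYS_ONLY / OLD_IMAGE
            "old": change.get("OldImage"),   # missing for KEYS_ONLY / NEW_IMAGE
        })
    return summary


# Hypothetical event for a stream configured with NEW_AND_OLD_IMAGES.
sample_event = {
    "Records": [
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "StreamViewType": "NEW_AND_OLD_IMAGES",
                "Keys": {"Artist": {"S": "No One You Know"}},
                "OldImage": {"Artist": {"S": "No One You Know"}, "Plays": {"N": "1"}},
                "NewImage": {"Artist": {"S": "No One You Know"}, "Plays": {"N": "2"}},
            },
        }
    ]
}

result = handler(sample_event)
print(result[0]["event"], result[0]["new"]["Plays"])
```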
DynamoDB Global Tables
Global tables provide managed multi-master, multi-region replication based on DynamoDB Streams.
- Enable Streams, start with an empty table, and add a region.
- Multi-region redundancy for high availability.
A global table is used for globally distributed applications for multi-region redundancy.
Replication latency is typically under one second.
DynamoDB Indexes
Indexes provide an alternative representation of data for varying query demands. You can retrieve data from an index using a Query. A table can have multiple secondary indexes to support many different query patterns.
Local Secondary Index (LSI)
- An LSI is an alternative view of the table and must be created when the table is created.
- An LSI uses the same PK but an alternative SK. The SK consists of exactly one scalar attribute.
- An LSI shares the RCUs and WCUs of the main table.
- You can create up to 5 LSIs per table.
Global Secondary Index (GSI)
- GSI can be created at any point after the table is created.
- A GSI can use a different PK and SK from the table to allow efficient query operations.
- A GSI is based on asynchronously replicated data and has its own RCUs and WCUs.
- It does not support strongly consistent reads; only eventually consistent reads are supported.
- You can create up to 20 GSIs per table.
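As an illustration of adding and using a GSI, the sketch below builds parameters for an `UpdateTable` call that creates an index on an existing table, and for a `Query` that targets that index. The index name `GenreYearIndex` and the attributes `Genre` and `Year` are assumptions for this example.

```python
def build_add_gsi(table_name, rcu=5, wcu=5):
    """UpdateTable parameters that create a GSI keyed on Genre (PK) and Year (SK)."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "Genre", "AttributeType": "S"},
            {"AttributeName": "Year", "AttributeType": "N"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": "GenreYearIndex",
                    "KeySchema": [
                        {"AttributeName": "Genre", "KeyType": "HASH"},
                        {"AttributeName": "Year", "KeyType": "RANGE"},
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                    # The index has its own capacity, separate from the table.
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": rcu,
                        "WriteCapacityUnits": wcu,
                    },
                }
            }
        ],
    }


def build_gsi_query(table_name, genre):
    """Query parameters that read from the index instead of the base table."""
    return {
        "TableName": table_name,
        "IndexName": "GenreYearIndex",
        "KeyConditionExpression": "#g = :g",
        "ExpressionAttributeNames": {"#g": "Genre"},
        "ExpressionAttributeValues": {":g": {"S": genre}},
        "ConsistentRead": False,  # a GSI only supports eventually consistent reads
    }


gsi_update = build_add_gsi("Music")
gsi_query = build_gsi_query("Music", "Country")
print(gsi_query["IndexName"])
```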
Time To Live (TTL)
You can set the expiry time for the data in the DynamoDB table.
- Expired items are marked for deletion.
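- After being marked, the item is typically deleted within 48 hours. The exact timing is not guaranteed, so expired items can still appear in reads until they are removed.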
- Good for removing old data such as sessions, event logs, or temporary data.
- Store the expiry time in an attribute on each item, then enable TTL on the table and point it at that attribute.
The TTL value should be
- a Number (N) data type
- a timestamp in Unix epoch time format (in seconds)
- Use the converter to get the value: https://www.epochconverter.com/
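The epoch value can also be computed in code. The sketch below derives an expiry timestamp in seconds, stores it on a hypothetical session item, and builds the parameters for an `UpdateTimeToLive` call; the table name `Sessions` and the attribute name `ExpiresAt` are assumptions.

```python
from datetime import datetime, timedelta, timezone


def ttl_epoch_seconds(created_at: datetime, lifetime: timedelta) -> int:
    """Return the expiry time as Unix epoch seconds."""
    return int((created_at + lifetime).timestamp())


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires_at = ttl_epoch_seconds(created, timedelta(days=1))

# The TTL attribute is a Number (N), sent as a string in the low-level format.
session_item = {
    "SessionId": {"S": "abc123"},
    "ExpiresAt": {"N": str(expires_at)},
}

# Parameters that enable TTL on the table using that attribute.
ttl_request = {
    "TableName": "Sessions",
    "TimeToLiveSpecification": {"Enabled": True, "AttributeName": "ExpiresAt"},
}

print(expires_at)
```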
Backups
Point-in-time recovery (PITR)
- PITR provides continuous backups of your data for 35 days.
- Incremental backups
- It helps you protect against accidental write or delete operations.
On-demand Backups
- Full backups at any time
- Zero impact on the table performance
Error retries and exponential backoff
- ProvisionedThroughputExceededException
- Your request rate is too high for your provisioned RCUs/WCUs.
When you occasionally get a server-side exception or a ProvisionedThroughputExceededException, you can implement retry logic.
- In addition to simple retries, each AWS SDK implements an exponential backoff algorithm for better flow control.
- The exponential backoff uses progressively longer waits between retries for consecutive error responses.
- For example, up to 50 milliseconds before the first retry, up to 100 milliseconds before the second, up to 200 milliseconds before third, and so on.
If the problem persists, consider increasing the read/write capacity.
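The retry pattern described above can be sketched in a few lines. The AWS SDKs already do this internally, so the code below is only an illustration of the idea: `ThrottledError` is a stand-in exception defined here, not an SDK class, and the random wait up to a doubling ceiling follows the 50 ms, 100 ms, 200 ms example.

```python
import random
import time


class ThrottledError(Exception):
    """Stand-in for a throttling error such as ProvisionedThroughputExceededException."""


def call_with_backoff(operation, max_retries=5, base_delay=0.05, sleep=time.sleep):
    """Call operation(); on throttling, wait up to base_delay * 2**attempt and retry."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            sleep(random.uniform(0, base_delay * 2 ** attempt))


# Usage: a fake operation that is throttled twice before succeeding.
attempts = {"count": 0}
waits = []


def flaky_write():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise ThrottledError()
    return "ok"


outcome = call_with_backoff(flaky_write, sleep=waits.append)
print(outcome, len(waits))
```

Passing `sleep=waits.append` records the waits instead of sleeping, which keeps the example fast to run.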
Fine-grained Access Control with IAM
IAM Condition parameter dynamodb:LeadingKeys allows users to access only the items where the partition key matches their user id.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAccessToOnlyItemsMatchingUserID",
            "Effect": "Allow",
            "Action": [
                "dynamodb:GetItem",
                "dynamodb:PutItem",
                "dynamodb:UpdateItem"
            ],
            "Resource": [
                "arn:aws:dynamodb:us-west-1:123456789012:table/Orders"
            ],
            "Condition": {
                "ForAllValues:StringEquals": {
                    "dynamodb:LeadingKeys": [
                        "${www.amazon.com:user_id}"
                    ],
                    "dynamodb:Attributes": [
                        "UserId",
                        "Item",
                        "Price"
                    ]
                },
                "StringEqualsIfExists": {
                    "dynamodb:Select": "SPECIFIC_ATTRIBUTES"
                }
            }
        }
    ]
}