[AWS] DynamoDB

DynamoDB is a highly resilient, low-latency NoSQL database service that replicates data across multiple Availability Zones (AZs). It can replace existing NoSQL databases such as MongoDB, Cassandra, or Oracle NoSQL.


Features

  • DynamoDB is resilient within a region.
    • DynamoDB replicates data across at least 3 geographically distinct data centers (AZs) in that region.
  • Performance
    • Data is stored on SSDs.
  • DynamoDB is a serverless, fully managed NoSQL database service designed for Online Transactional Processing (OLTP) workloads.
    • Stores data in JSON-like, name-value documents
    • Flexible schema
    • Scales horizontally
    • Supports event-driven programming

Components

  • TABLE: a collection of items that share the same primary key schema, either a Partition Key (PK) alone or a PK plus a Sort Key (SK)
  • ITEM: a collection of attributes (up to 400 KB in total) stored in a table
  • ATTRIBUTE: a name-value pair
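As a rough illustration of the three components, the sketch below models one item as a plain Python dictionary and checks it against the 400 KB item limit. The attribute names are made up for this example, and the size estimate uses the JSON encoding, which only approximates how DynamoDB measures an item.

```python
import json

# Hypothetical item for an "Orders" table; every name here is illustrative.
item = {
    "customer_id": "C-1001",      # attribute: name -> value
    "order_date": "2024-05-01",
    "total": 42,
    "tags": ["gift", "express"],  # attributes may also hold lists or nested maps
}

ITEM_SIZE_LIMIT_BYTES = 400 * 1024  # 400 KB per item


def approximate_item_size(item: dict) -> int:
    """Approximate the item size as the byte length of its JSON encoding.

    DynamoDB applies its own sizing rules, so this is only an estimate.
    """
    return len(json.dumps(item).encode("utf-8"))


def fits_in_one_item(item: dict) -> bool:
    """Return True if the estimated size is within the 400 KB limit."""
    return approximate_item_size(item) <= ITEM_SIZE_LIMIT_BYTES
```

Larger payloads are usually kept outside the table (for example in object storage) with only a pointer saved in the item.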

DynamoDB supports automatic backups with optional point-in-time recovery (covering the last 35 days) as well as manual snapshots. Backups have no performance impact on the table.

  • Data at rest is encrypted with an AWS owned key (the default) or a KMS key.

A table's primary key consists of a partition key (PK) and, optionally, a sort key (SK). This gives 2 types of primary key.

Primary Keys

  • Partition Key
    • The partition key alone identifies an item, so it must be a unique attribute such as an ID or email address.
    • Each item is stored and located by its own partition key value.
  • Composite Key (Partition Key + Sort Key)
    • Items in the table may share the same partition key, but they must have different sort keys.
    • All items with the same partition key are stored together and sorted by the sort key.
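The two primary key types map onto the `KeySchema` of a CreateTable request, where `HASH` marks the partition key and `RANGE` marks the sort key. The following is a minimal sketch that only builds the request parameters. The helper name `table_definition`, the table names, and the choice of string-typed keys are assumptions made for illustration.

```python
def table_definition(table_name, partition_key, sort_key=None):
    """Build CreateTable parameters for a simple or a composite primary key.

    Both key attributes are assumed to be strings ("S") in this sketch.
    """
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attribute_definitions = [
        {"AttributeName": partition_key, "AttributeType": "S"}
    ]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append(
            {"AttributeName": sort_key, "AttributeType": "S"}
        )
    return {
        "TableName": table_name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        "BillingMode": "PAY_PER_REQUEST",  # on-demand, so no capacity numbers needed
    }


# Partition key only: one item per email address.
users = table_definition("Users", "email")

# Composite key: many orders per customer, sorted by order date.
orders = table_definition("Orders", "customer_id", "order_date")
```

With boto3 installed and credentials configured, such a dictionary could be passed as `boto3.client("dynamodb").create_table(**orders)`.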

How do you use the keys?

  • The partition key determines the logical partition in which an item is stored and affects the underlying physical partitions.
    • The provisioned I/O capacity of the table is divided evenly among these physical partitions.
  • The throughput of a table therefore depends on the partition key design. To improve performance, choose a partition key with many distinct values. With more distinct values, requests are spread more evenly across the partitions.
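The effect of partition key cardinality can be seen in a small simulation. This is only a sketch: DynamoDB's real hash function and partition count are internal, so MD5 and a fixed count of 4 partitions are stand-ins chosen for this example.

```python
import hashlib
from collections import Counter


def partition_for(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition number (MD5 is a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def spread(keys, partition_count=4):
    """Count how many requests land on each partition."""
    return Counter(partition_for(key, partition_count) for key in keys)


# Low cardinality: every request uses the same key, so one partition is hot.
hot = spread(["2024-05-01"] * 1000)

# High cardinality: 1000 distinct keys are spread over all partitions.
even = spread([f"user-{i}" for i in range(1000)])

print("same key     :", dict(hot))
print("distinct keys:", dict(even))
```

In the first case a single partition receives all 1000 requests and can use only its own share of the table's capacity, while the second case uses every partition.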

Reading and Writing Data

DynamoDB writes and reads an item as a whole.

Basic Write Operations

  • PutItem: write an item with the specified primary key
  • UpdateItem: change attributes of the item with the specified primary key
  • BatchWriteItem: write multiple items with the specified primary keys
  • DeleteItem: remove the item with the specified primary key
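To make the write operations concrete, here is a sketch of the request parameters each one takes in the low-level API format, where every value is tagged with its type (`S` for string, `N` for number). The table name `Orders` and all attribute names are hypothetical.

```python
TABLE = "Orders"  # hypothetical table name


def key(customer_id, order_date):
    """Primary key (PK + SK) in the low-level attribute-value format."""
    return {"customer_id": {"S": customer_id}, "order_date": {"S": order_date}}


# PutItem: writes the whole item under its primary key.
put_item = {
    "TableName": TABLE,
    "Item": {
        **key("C-1001", "2024-05-01"),
        "order_status": {"S": "NEW"},
        "total": {"N": "42"},  # numbers are sent as strings
    },
}

# UpdateItem: changes selected attributes of the item with this key.
update_item = {
    "TableName": TABLE,
    "Key": key("C-1001", "2024-05-01"),
    "UpdateExpression": "SET #s = :s",
    "ExpressionAttributeNames": {"#s": "order_status"},  # placeholder for the name
    "ExpressionAttributeValues": {":s": {"S": "SHIPPED"}},
}

# DeleteItem: removes the item with this key.
delete_item = {"TableName": TABLE, "Key": key("C-1001", "2024-05-01")}

# BatchWriteItem: several puts and deletes in one request.
batch_write_item = {
    "RequestItems": {
        TABLE: [
            {"PutRequest": {"Item": {**key("C-1002", "2024-05-02"), "total": {"N": "7"}}}},
            {"DeleteRequest": {"Key": key("C-1003", "2024-05-03")}},
        ]
    }
}
```

With boto3, these dictionaries would be passed to the client methods `put_item`, `update_item`, `delete_item`, and `batch_write_item` respectively, for example `client.put_item(**put_item)`.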

Basic Read Operations

  • GetItem: retrieve the item with the specified primary key
  • BatchGetItem: retrieve the items with the specified primary keys
  • Scan: retrieve all items in the table
  • Query: retrieve items matching the sort key expression for the specified partition key

Scan

  • Most flexible option.
  • You can apply a filter with any attribute.
    • You can use a scan operation without a filter. This will get all items.
    • When a filter is used, a scan operation still reads all items and applies the filter afterwards. So, it consumes capacity units for the whole table.

Query

  • With Query, you retrieve only the data you want.
  • Only the PK and SK can be used as query conditions.
    • The partition key is required, and the query returns items of that partition key.
    • You can refine the query with a condition on the sort key.
    • Query uses the key index directly and is very efficient.
  • You can apply filters on other attributes after the items are queried. Filtered-out items still consume read capacity.
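The difference between Scan and Query can be sketched with a toy in-memory table that counts how many items each operation has to read. `ToyTable` is a hypothetical stand-in written for this example, not DynamoDB's implementation. It only mirrors the idea that items are grouped by partition key and sorted by sort key.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table with a composite key (pk + sk)."""

    def __init__(self):
        # pk -> list of (sk, item) tuples kept sorted by sk
        self.partitions = defaultdict(list)

    def put(self, pk, sk, **attrs):
        rows = self.partitions[pk]
        index = bisect_left([row[0] for row in rows], sk)
        item = {"pk": pk, "sk": sk, **attrs}
        if index < len(rows) and rows[index][0] == sk:
            rows[index] = (sk, item)  # same primary key: replace the item
        else:
            rows.insert(index, (sk, item))

    def query(self, pk, sk_from, sk_to, filter_fn=None):
        """Read one partition, narrowed to a sort key range, then filter."""
        rows = self.partitions.get(pk, [])
        sort_keys = [row[0] for row in rows]
        start = bisect_left(sort_keys, sk_from)
        end = bisect_right(sort_keys, sk_to)
        read = [item for _, item in rows[start:end]]
        kept = [i for i in read if filter_fn is None or filter_fn(i)]
        return kept, len(read)

    def scan(self, filter_fn=None):
        """Read every item in every partition, then filter."""
        read = [item for rows in self.partitions.values() for _, item in rows]
        kept = [i for i in read if filter_fn is None or filter_fn(i)]
        return kept, len(read)


table = ToyTable()
for customer in ("C-1", "C-2", "C-3"):
    for day in range(1, 11):
        table.put(customer, f"2024-05-{day:02d}", total=day * 10)


def is_big(item):
    return item["total"] >= 30


query_items, query_read = table.query("C-1", "2024-05-01", "2024-05-05", is_big)
scan_items, scan_read = table.scan(is_big)
print(f"query read {query_read} items, returned {len(query_items)}")
print(f"scan  read {scan_read} items, returned {len(scan_items)}")
```

In both cases the filter runs after the read, so the number of items read is what drives the capacity cost.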

Eventually Consistent Reads (default)

  • Data can be read from any node. A write usually reaches all nodes within 1 second.
  • The data received may not reflect a recent write.
  • The best practice is to use eventually consistent reads wherever the application can tolerate slightly stale data.

Strongly Consistent Reads

  • Data is retrieved from a leader node.
  • The most up-to-date copy of the data is returned.

Capacity

DynamoDB reserves the necessary resources to handle your throughput requirements and divides the throughput evenly among partitions.

Read Capacity Units (RCU)

  • 1 RCU = 1 strongly consistent read per second of an item up to 4 KB
  • 1 RCU = 2 eventually consistent reads per second of items up to 4 KB each (2*4 = 8 KB per second)

Write Capacity Units (WCU):

  • 1 WCU = 1 write per second of an item up to 1 KB
  • A transactional write requires 2 WCUs per 1 KB to complete (prepare + commit).
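The capacity arithmetic above can be worked through in a few lines. This is a sketch under the stated rules (4 KB read blocks, 1 KB write blocks, sizes rounded up per item). The function names are invented for this example, and the doubling for transactional reads is the analogous rule for reads, which this article does not cover.

```python
import math


def read_capacity_units(item_size_bytes, reads_per_second,
                        strongly_consistent=True, transactional=False):
    """RCUs needed to read items of one size at a given rate."""
    blocks = max(1, math.ceil(item_size_bytes / 4096))  # 4 KB blocks, rounded up
    if transactional:
        per_read = blocks * 2      # transactional reads cost double
    elif strongly_consistent:
        per_read = blocks
    else:
        per_read = blocks / 2      # eventually consistent reads cost half
    return math.ceil(per_read * reads_per_second)


def write_capacity_units(item_size_bytes, writes_per_second, transactional=False):
    """WCUs needed to write items of one size at a given rate."""
    blocks = max(1, math.ceil(item_size_bytes / 1024))  # 1 KB blocks, rounded up
    return blocks * writes_per_second * (2 if transactional else 1)


# 10 reads per second of 6 KB items (each item spans two 4 KB blocks):
print(read_capacity_units(6 * 1024, 10))                             # strongly consistent
print(read_capacity_units(6 * 1024, 10, strongly_consistent=False))  # eventually consistent

# 10 writes per second of 2.5 KB items (each item spans three 1 KB blocks):
print(write_capacity_units(2560, 10))
print(write_capacity_units(2560, 10, transactional=True))
```

Because sizes are rounded up per item, a 4.1 KB item costs as much to read as an 8 KB item, which is one reason to keep items small.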

Read/Write Capacity Modes

  • Provisioned throughput (default): Each table is configured with RCU and WCU.
  • On-demand: The capacity scales automatically, and you are charged per request. It suits workloads whose usage is not predictable. There is no minimum capacity.
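In the CreateTable and UpdateTable APIs, the two modes are selected with the `BillingMode` parameter. The helper below is a small sketch that returns the matching settings. The function name and the mode labels `"provisioned"` and `"on-demand"` are assumptions for this example.

```python
def capacity_settings(mode, rcu=None, wcu=None):
    """Return the capacity-related request parameters for a capacity mode."""
    if mode == "on-demand":
        # Pay per request; no RCU or WCU values are configured.
        return {"BillingMode": "PAY_PER_REQUEST"}
    if mode == "provisioned":
        if rcu is None or wcu is None:
            raise ValueError("provisioned mode needs both rcu and wcu")
        return {
            "BillingMode": "PROVISIONED",
            "ProvisionedThroughput": {
                "ReadCapacityUnits": rcu,
                "WriteCapacityUnits": wcu,
            },
        }
    raise ValueError(f"unknown capacity mode: {mode}")


print(capacity_settings("provisioned", rcu=10, wcu=5))
print(capacity_settings("on-demand"))
```

These settings would be merged into the rest of the table definition, for example `{**table_params, **capacity_settings("on-demand")}`.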

The following DynamoDB features are charged:

  • Read and Write Capacity
  • Storage of data
  • Cross-region data transfer (data transfer within a single region is not charged)

DynamoDB Streams

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.Tutorial.html

When enabled, DynamoDB Streams captures a time-ordered sequence of item-level modifications (insert, update, and delete) in a DynamoDB table and stores the information for up to 24 hours.

You can create a trigger system (event-driven architecture) with DynamoDB streams and Lambda functions.

DynamoDB Streams supports the following stream record views:

  • KEYS_ONLY: Only the key attributes of the modified item
  • NEW_IMAGE: The entire item, as it appears after it was modified
  • OLD_IMAGE: The entire item, as it appears before it was modified
  • NEW_AND_OLD_IMAGES: Both the new and the old images of the item
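A Lambda function attached to a stream receives batches of records, each carrying an `eventName` (INSERT, MODIFY, or REMOVE) and a `dynamodb` section with the keys and, depending on the view type, the new and old images. The handler below is a minimal sketch. The sample event is hand-written for illustration and contains only the fields the handler uses.

```python
def handler(event, context=None):
    """Summarize the item-level changes in a batch of stream records."""
    changes = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        changes.append({
            "action": record["eventName"],  # INSERT, MODIFY, or REMOVE
            "keys": data["Keys"],           # present in every view type
            "new": data.get("NewImage"),    # missing for KEYS_ONLY and OLD_IMAGE
            "old": data.get("OldImage"),    # missing for KEYS_ONLY and NEW_IMAGE
        })
    return changes


# Hand-written sample in the shape of a NEW_AND_OLD_IMAGES stream.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"id": {"S": "1"}},
                "NewImage": {"id": {"S": "1"}, "state": {"S": "NEW"}},
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {
                "Keys": {"id": {"S": "1"}},
                "OldImage": {"id": {"S": "1"}, "state": {"S": "NEW"}},
            },
        },
    ]
}

for change in handler(sample_event):
    print(change["action"], change["keys"])
```

Using `.get()` for the images keeps the same handler usable with any of the four view types.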

DynamoDB Global Tables

Global tables provide managed, multi-master, multi-region replication built on DynamoDB Streams.

  • Enable Streams, start with an empty table, and add a region.
  • Multi-region redundancy for high availability.


A global table suits globally distributed applications that need multi-region redundancy.

Replication latency is typically under one second.


DynamoDB Indexes

Indexes provide an alternative representation of the data for varying query demands. You can retrieve data from an index using a Query. A table can have multiple secondary indexes to support many different query patterns.

Local Secondary Index (LSI)

  • An LSI is an alternative view of the table and must be created together with the table.
  • An LSI uses the same PK but an alternative SK. The SK consists of exactly one scalar attribute.
  • An LSI shares the RCU and WCU of the main table.
  • You can create up to 5 LSIs per table.

Global Secondary Index (GSI)

  • A GSI can be created at any point after the table is created.
  • A GSI can use a different PK and SK to allow efficient query operations.
  • A GSI is based on asynchronously replicated data and has its own RCU and WCU.
  • It cannot use strongly consistent reads. Only eventually consistent reads are supported.
  • You can create up to 20 GSIs per table.
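The two index types appear in a CreateTable request as `LocalSecondaryIndexes` and `GlobalSecondaryIndexes` entries. The helpers below sketch how such entries could be built. The function names, index names, and attribute names are hypothetical, and projecting all attributes (`ALL`) is just one of the available projection choices.

```python
def local_secondary_index(name, partition_key, alternative_sort_key):
    """LSI: same partition key as the table, different sort key."""
    return {
        "IndexName": name,
        "KeySchema": [
            {"AttributeName": partition_key, "KeyType": "HASH"},
            {"AttributeName": alternative_sort_key, "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }


def global_secondary_index(name, partition_key, sort_key=None, rcu=None, wcu=None):
    """GSI: its own partition key, optional sort key, and own capacity."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
    index = {
        "IndexName": name,
        "KeySchema": key_schema,
        "Projection": {"ProjectionType": "ALL"},
    }
    if rcu is not None and wcu is not None:
        # Only set for provisioned tables; on-demand tables omit it.
        index["ProvisionedThroughput"] = {
            "ReadCapacityUnits": rcu,
            "WriteCapacityUnits": wcu,
        }
    return index


# Table keyed by customer_id + order_date, with two extra access patterns.
by_total = local_secondary_index("ByTotal", "customer_id", "total")
by_product = global_secondary_index("ByProduct", "product_id", "order_date", rcu=5, wcu=5)

# Querying an index names it explicitly in the Query request.
index_query = {
    "TableName": "Orders",
    "IndexName": "ByProduct",
    "KeyConditionExpression": "product_id = :p",
    "ExpressionAttributeValues": {":p": {"S": "P-42"}},
}
```

Every attribute used in an index key schema also has to be listed in the table's `AttributeDefinitions`.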

Time To Live (TTL)

You can set the expiry time for the data in the DynamoDB table.

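  • Expired items are marked for deletion and removed by a background process, so they can remain readable for a while after they expire.
  • Good for removing old data such as sessions, event logs, or temporary data.
  • You need to store an expiry timestamp in an attribute of each item and enable TTL on that attribute.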
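DynamoDB TTL reads the expiry from a Number attribute that holds a Unix epoch timestamp in seconds. The sketch below computes such a value and shows the parameters that enable TTL on a table. The attribute name `expires_at` and the table name `Sessions` are assumptions for this example.

```python
import time

TTL_ATTRIBUTE = "expires_at"  # hypothetical attribute name


def expiry_timestamp(ttl_seconds, now=None):
    """Return the epoch time (in whole seconds) at which an item expires."""
    now = time.time() if now is None else now
    return int(now) + ttl_seconds


def is_expired(item, now=None):
    """Client-side check, useful because deletion is not immediate."""
    now = time.time() if now is None else now
    return item[TTL_ATTRIBUTE] <= now


# A session item that should expire one hour after it is created.
session = {"session_id": "abc123", TTL_ATTRIBUTE: expiry_timestamp(3600)}

# Parameters for enabling TTL on the table (UpdateTimeToLive).
enable_ttl = {
    "TableName": "Sessions",
    "TimeToLiveSpecification": {"Enabled": True, "AttributeName": TTL_ATTRIBUTE},
}
```

With boto3, the last dictionary would be passed as `client.update_time_to_live(**enable_ttl)`. Filtering with a check like `is_expired` on reads hides items that have expired but are not yet deleted.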

The TTL value should be
