[AWS] DynamoDB

DynamoDB is a highly resilient, fully managed NoSQL database service that replicates data across multiple Availability Zones (AZs). It can replace existing NoSQL databases such as MongoDB, Apache Cassandra, or Oracle NoSQL.

  • DynamoDB is resilient within a region.
    • Data is stored on SSDs and spread across at least 3 distinct data centers (AZs).
  • DynamoDB stores data as JSON-like documents made of name-value pairs.

Components

  • TABLE: a collection of items that share the same primary key schema, either a Partition Key (PK) alone or a PK plus a Sort Key (SK)
  • ITEM: a collection of attributes (up to 400 KB in total) inside the table
  • ATTRIBUTE: a name and a value
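To make the components concrete, here is a minimal sketch of one item in the typed attribute-value format used by DynamoDB's low-level API, where each value is tagged with a type such as `S` (string) or `N` (number). The table name `Orders` and the attribute names are hypothetical. The dictionary has the shape that boto3's `put_item` accepts, but the sketch only builds it and never calls AWS.

```python
# Hypothetical table and attribute names, for illustration only.
TABLE_NAME = "Orders"


def make_order_item(customer_id: str, order_date: str, total: float) -> dict:
    """Build one item in DynamoDB's typed attribute-value format."""
    return {
        "CustomerId": {"S": customer_id},  # partition key (PK)
        "OrderDate": {"S": order_date},    # sort key (SK)
        "Total": {"N": str(total)},        # numbers are sent as strings
    }


# Keyword arguments in the shape a PutItem request expects.
put_item_params = {
    "TableName": TABLE_NAME,
    "Item": make_order_item("c-100", "2021-03-01", 42.5),
}
print(put_item_params)
```

Each entry of `Item` is one attribute (name plus typed value), and the whole dictionary is one item of the table.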

DynamoDB supports continuous automatic backups with optional point-in-time recovery (up to 35 days back) and manual on-demand backups. Backups have no performance impact on the table.

  • Data is encrypted at rest with an AWS-owned key (the default) or with a KMS key.

A table's primary key is either a partition key (PK) alone or a PK combined with a sort key (SK).

  • The partition key portion of a table's primary key determines the logical partitions in which the table's data is stored and affects the underlying physical partitions. The provisioned I/O capacity of the table is divided evenly among these physical partitions.
  • The throughput of a table therefore depends on the partition key design. To improve performance, use a partition key with many distinct values, so that requests are spread across the partitioned space instead of concentrating on a few partitions.
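The effect of partition key cardinality can be sketched with a small simulation. The hash below (MD5 modulo a fixed partition count) is only a stand-in chosen for illustration, since DynamoDB's internal hash function and partition count are not exposed. The key names are hypothetical.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # assumed number of physical partitions


def partition_for(partition_key: str) -> int:
    """Map a key to a partition with a stand-in hash (not DynamoDB's own)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def load_per_partition(keys) -> Counter:
    """Count how many requests land on each partition."""
    return Counter(partition_for(k) for k in keys)


# 1,000 requests that all use the same key hit a single partition.
hot = load_per_partition(["status#ACTIVE"] * 1000)
# 1,000 requests with distinct keys are spread across partitions.
spread = load_per_partition(f"user#{i}" for i in range(1000))

print("one key:      ", dict(hot))
print("distinct keys:", dict(spread))
```

With a single key value, one partition receives all the traffic while the capacity assigned to the others sits idle. With many distinct values the load is shared.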

Reading and Writing Data

DynamoDB writes and reads an item as a whole.

Scan

  • The most flexible option. You can apply a filter on any attribute.
  • You can run a scan without a filter. This returns all items.
  • When a filter is used, a scan still reads every item and applies the filter afterwards, so it consumes capacity for the whole table.

Query

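  • With Query, you read only the items that match the key condition.
  • Only the PK (required, matched by equality) and the SK (optional, for example a range or prefix) can be used in the key condition.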
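The difference in items read can be sketched with a tiny in-memory stand-in for a table. The data, the `pk`/`sk` attribute names, and the `scan`/`query` helpers are all hypothetical; they mimic only the read behaviour described above and are not the DynamoDB API.

```python
# A tiny in-memory stand-in for a table; names and data are hypothetical.
TABLE = [
    {"pk": "user#1", "sk": "order#001", "status": "SHIPPED"},
    {"pk": "user#1", "sk": "order#002", "status": "PENDING"},
    {"pk": "user#2", "sk": "order#001", "status": "SHIPPED"},
    {"pk": "user#3", "sk": "order#001", "status": "PENDING"},
]


def scan(predicate=None):
    """Read every item, then apply the optional filter to what was read."""
    items_read = len(TABLE)
    matches = [i for i in TABLE if predicate is None or predicate(i)]
    return matches, items_read


def query(pk, sk_prefix=""):
    """Select by key: only items under one PK, narrowed by an SK prefix.

    The list comprehension stands in for a keyed lookup, so the number of
    items read equals the number of items selected.
    """
    selected = [i for i in TABLE
                if i["pk"] == pk and i["sk"].startswith(sk_prefix)]
    return selected, len(selected)


pending, read_by_scan = scan(lambda i: i["status"] == "PENDING")
orders, read_by_query = query("user#1", "order#")
print(f"scan:  {len(pending)} matches, {read_by_scan} items read")
print(f"query: {len(orders)} matches, {read_by_query} items read")
```

Both calls return two items here, but the scan had to read the whole table to find them.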

Eventually Consistent Reads (default)

  • Data can be read from any node. Writes usually reach all nodes within 1 second.
  • The data returned may not reflect a recent write.

Strongly Consistent Reads

  • Data is retrieved from the leader node.
  • The most up-to-date copy of the data is returned.
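In the API, the choice between the two read modes is a single flag. The sketch below builds GetItem keyword arguments with the `ConsistentRead` parameter; the helper name `get_item_params`, the table name, and the key attributes are hypothetical, and nothing here calls AWS.

```python
def get_item_params(table_name: str, key: dict,
                    strongly_consistent: bool = False) -> dict:
    """Build GetItem keyword arguments.

    Reads are eventually consistent unless ConsistentRead is set to True.
    """
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }


# Hypothetical key for a table with a PK and an SK.
key = {"CustomerId": {"S": "c-100"}, "OrderDate": {"S": "2021-03-01"}}

eventual = get_item_params("Orders", key)
strong = get_item_params("Orders", key, strongly_consistent=True)
print(eventual["ConsistentRead"], strong["ConsistentRead"])
```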

Capacity

Read Capacity Units (RCU)

  • 1 RCU = one strongly consistent read per second of an item up to 4 KB
  • 1 RCU = two eventually consistent reads per second of items up to 4 KB each (2*4 = 8 KB per second)

Write Capacity Units (WCU)

  • 1 WCU = one write per second of an item up to 1 KB
  • A transactional write requires 2x the WCU to complete. (Prepare + Commit)
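The capacity arithmetic can be worked through in a few lines. This is a minimal sketch that assumes 1 KB = 1,024 bytes and that item sizes round up to the next 4 KB (reads) or 1 KB (writes). The function names are hypothetical, and the 2x cost for transactional reads is an assumption that mirrors the transactional write rule above.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one RCU covers up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # one WCU covers up to 1 KB


def read_capacity_units(item_size_bytes: int, reads_per_second: int,
                        mode: str = "strong") -> int:
    """RCU needed for a steady read rate of same-sized items."""
    units_per_read = math.ceil(item_size_bytes / READ_UNIT_BYTES)
    # Eventually consistent reads cost half; transactional reads cost double.
    factor = {"eventual": 0.5, "strong": 1, "transactional": 2}[mode]
    return math.ceil(units_per_read * reads_per_second * factor)


def write_capacity_units(item_size_bytes: int, writes_per_second: int,
                         transactional: bool = False) -> int:
    """WCU needed for a steady write rate of same-sized items."""
    units_per_write = math.ceil(item_size_bytes / WRITE_UNIT_BYTES)
    factor = 2 if transactional else 1  # prepare + commit
    return units_per_write * writes_per_second * factor


# A 6 KB item rounds up to 8 KB, i.e. 2 units per strongly consistent read.
print(read_capacity_units(6 * 1024, 10, "strong"))
print(read_capacity_units(6 * 1024, 10, "eventual"))
# A 2.5 KB item rounds up to 3 KB, i.e. 3 units per write.
print(write_capacity_units(2560, 10))
print(write_capacity_units(2560, 10, transactional=True))
```

Because sizes round up per item, many small items just over a unit boundary cost noticeably more than their raw byte count suggests.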

Read/Write Capacity Modes

  • Provisioned throughput (default): Each table is configured with RCU and WCU.
  • On-demand: The capacity scales automatically, and you are charged per request. Use it when usage is not predictable. There is no minimum capacity.
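The two modes map to the `BillingMode` setting of a CreateTable request. The sketch below only builds the request parameters and never calls AWS. The helper name, the single-attribute key `pk`, and the default of 5 RCU / 5 WCU are assumptions for illustration.

```python
def table_params(table_name: str, on_demand: bool = False,
                 rcu: int = 5, wcu: int = 5) -> dict:
    """Build CreateTable keyword arguments for either capacity mode."""
    params = {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "pk", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "pk", "KeyType": "HASH"}],
    }
    if on_demand:
        # On-demand: no capacity figures, billing is per request.
        params["BillingMode"] = "PAY_PER_REQUEST"
    else:
        # Provisioned: the table is configured with explicit RCU and WCU.
        params["BillingMode"] = "PROVISIONED"
        params["ProvisionedThroughput"] = {
            "ReadCapacityUnits": rcu,
            "WriteCapacityUnits": wcu,
        }
    return params


provisioned = table_params("Events", rcu=10, wcu=5)
on_demand = table_params("Events", on_demand=True)
print(provisioned["BillingMode"], on_demand["BillingMode"])
```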

The following DynamoDB features are charged:

  • Read and Write Capacity
  • Storage of data
  • Cross-region data transfer (no charge for a single region data transfer)

DynamoDB Streams

When enabled, DynamoDB Streams captures a time-ordered sequence of item-level modifications (insert, update, and delete) in a DynamoDB table and stores the information for up to 24 hours.

You can build a trigger system (event-driven architecture) with DynamoDB Streams and Lambda functions.

DynamoDB Streams supports the following stream record views:

  • KEYS_ONLY: Only the key attributes of the modified item
  • NEW_IMAGE: The entire item, as it appears after it was modified
  • OLD_IMAGE: The entire item, as it appears before it was modified
  • NEW_AND_OLD_IMAGES: Both the new and the old images of the item
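A Lambda function attached to a stream receives batches of records. The handler below is a minimal sketch of reading them: the `Records`, `eventName`, `Keys`, `OldImage`, and `NewImage` fields follow the stream record format, while the handler body and the sample event (with a hypothetical `pk` key) are illustrative only.

```python
def handler(event, context=None):
    """Summarize each stream record in the batch."""
    summaries = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        summaries.append({
            "event": record["eventName"],   # INSERT, MODIFY, or REMOVE
            "keys": change["Keys"],         # included in every view type
            # Images appear only if the stream view type includes them.
            "old": change.get("OldImage"),
            "new": change.get("NewImage"),
        })
    return summaries


# Hypothetical batch from a stream configured with NEW_AND_OLD_IMAGES.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"pk": {"S": "user#1"}},
                "NewImage": {"pk": {"S": "user#1"}, "plan": {"S": "free"}},
            },
        },
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "Keys": {"pk": {"S": "user#1"}},
                "OldImage": {"pk": {"S": "user#1"}, "plan": {"S": "free"}},
                "NewImage": {"pk": {"S": "user#1"}, "plan": {"S": "pro"}},
            },
        },
    ]
}

for summary in handler(sample_event):
    print(summary["event"], summary["keys"])
```

An INSERT has no old image and a REMOVE has no new image, which is why the handler reads both with `.get()`.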

DynamoDB Global Tables

Global tables provide managed multi-master, multi-region replication built on DynamoDB Streams.

  • Enable Streams, start with an empty table, and add a region.
  • Multi-region redundancy for high availability.

A global table suits globally distributed applications that need multi-region redundancy.

Replication latency is typically under one second.

DynamoDB Indexes

Indexes provide an alternative representation of the data for varying query demands. You can retrieve data from an index using a Query. A table can have multiple secondary indexes to support many different query patterns.

Local Secondary Index (LSI)

  • LSI is an alternative view of the table and must be created at the same time as the table.
  • LSI uses the same PK but an alternative SK. The SK consists of exactly one scalar attribute.
  • LSI shares throughput (RCU and WCU) with the main table.
  • You can create up to 5 LSIs per table.

Global Secondary Index (GSI)

  • GSI can be created at any point after the table is created.
  • GSI can use a different PK and SK to allow efficient query operations.
  • GSI is based on asynchronously replicated data and has its own RCU and WCU.
  • It does not support strongly consistent reads. Only eventually consistent reads are supported.
  • You can create up to 20 GSIs per table.
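Both index types are declared in the table definition. The sketch below builds CreateTable parameters for a hypothetical `Orders` table with one LSI and one GSI. All table, index, and attribute names are made up for illustration, and nothing here calls AWS.

```python
THROUGHPUT = {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}

orders_table = {
    "TableName": "Orders",
    # Every attribute used in any key schema must be declared here.
    "AttributeDefinitions": [
        {"AttributeName": "CustomerId", "AttributeType": "S"},
        {"AttributeName": "OrderDate", "AttributeType": "S"},
        {"AttributeName": "Status", "AttributeType": "S"},
        {"AttributeName": "ProductId", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "CustomerId", "KeyType": "HASH"},   # PK
        {"AttributeName": "OrderDate", "KeyType": "RANGE"},   # SK
    ],
    "ProvisionedThroughput": THROUGHPUT,
    # LSI: same PK as the table, alternative SK, shares the table's capacity.
    "LocalSecondaryIndexes": [{
        "IndexName": "ByStatus",
        "KeySchema": [
            {"AttributeName": "CustomerId", "KeyType": "HASH"},
            {"AttributeName": "Status", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }],
    # GSI: different PK and SK, with its own RCU and WCU.
    "GlobalSecondaryIndexes": [{
        "IndexName": "ByProduct",
        "KeySchema": [
            {"AttributeName": "ProductId", "KeyType": "HASH"},
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "KEYS_ONLY"},
        "ProvisionedThroughput": THROUGHPUT,
    }],
}

table_pk = orders_table["KeySchema"][0]["AttributeName"]
lsi_pk = orders_table["LocalSecondaryIndexes"][0]["KeySchema"][0]["AttributeName"]
gsi_pk = orders_table["GlobalSecondaryIndexes"][0]["KeySchema"][0]["AttributeName"]
print(table_pk, lsi_pk, gsi_pk)
```

The LSI entry has no throughput of its own because it draws on the table's capacity, while the GSI entry carries a separate `ProvisionedThroughput` block.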
