[AWS] Kinesis Data Firehose

Kinesis Data Firehose is a fully managed service that loads streaming data into data stores (such as S3) and analytics tools (such as Redshift, Elasticsearch, or Splunk), enabling near-real-time analytics with existing business intelligence tools.

Features

  • Kinesis Data Streams uses shards that retain data for a configurable period (data persistence), whereas Firehose does not store the data.
  • Firehose can batch, compress, transform, and encrypt the data before loading it.
    • For example, it can automatically convert incoming data to columnar formats such as Apache Parquet and Apache ORC before delivering it to a destination such as S3.
  • Firehose can optionally invoke an AWS Lambda function to transform incoming data before delivering it to a destination. A Lambda function cannot itself be a destination, however; the destinations are storage and analytics services.
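To make the Lambda transformation step more concrete, here is a minimal sketch of a transformation function in Python. It assumes each incoming record carries a JSON object, and the `processed` field it adds is purely a hypothetical enrichment for illustration. Firehose passes records with base64-encoded `data` and expects each one back with the same `recordId`, a `result` of `Ok`, `Dropped`, or `ProcessingFailed`, and the (possibly modified) base64-encoded `data`.

```python
import base64
import json


def lambda_handler(event, context):
    """Tag every JSON record passing through the delivery stream."""
    output = []
    for record in event["records"]:
        try:
            # Firehose hands over the payload base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            payload["processed"] = True  # hypothetical enrichment
            # Newline-delimit the records so they stay separable after delivery.
            encoded = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": encoded}
            )
        except (ValueError, TypeError):
            # Not valid base64/JSON, or not a JSON object: mark it as failed
            # and return the original data unchanged.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Records returned as `ProcessingFailed` are treated by Firehose as unsuccessfully processed, so one malformed record does not have to fail the whole batch.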

Use cases

Kinesis Firehose is used when:

  • you need to collect streaming data and deliver it to a destination quickly
  • processing is optional, and data retention is not important
  • for example, capturing data from IoT devices and streaming it into a data lake
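The IoT use case can be sketched from the producer side as follows. This is only an illustration under a few assumptions: the function name `send_readings`, the stream name, and the shape of the readings are hypothetical, and the client is expected to be a boto3 Firehose client (for example `boto3.client("firehose")`). The sketch uses the `PutRecordBatch` API, which accepts at most 500 records per call; it does not handle the per-call size limit or retry failed records.

```python
import json

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def send_readings(client, stream_name, readings):
    """Send readings to a delivery stream in batches; return the failed count."""
    failed = 0
    for start in range(0, len(readings), MAX_BATCH):
        batch = readings[start:start + MAX_BATCH]
        # Each record is a bytes payload; newline-delimited JSON keeps
        # individual readings separable once they land in the destination.
        records = [
            {"Data": (json.dumps(reading) + "\n").encode("utf-8")}
            for reading in batch
        ]
        response = client.put_record_batch(
            DeliveryStreamName=stream_name, Records=records
        )
        # FailedPutCount reports how many records in this call were rejected.
        failed += response["FailedPutCount"]
    return failed
```

A call might look like `send_readings(boto3.client("firehose"), "iot-to-data-lake", readings)`, where `iot-to-data-lake` is a made-up delivery stream name. A production producer would also inspect the per-record responses and resend the ones that failed.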
