Kinesis Firehose is a fully managed service that loads streaming data into data stores (such as S3) and analytics tools (such as Redshift, Elasticsearch, or Splunk), enabling near real-time analytics with existing business intelligence tools.
Amazon Kinesis Data Firehose
- Kinesis Data Streams uses shards and retains data for a configurable period (data persistence), whereas Firehose does not store the data.
- Firehose can batch, compress, transform, and encrypt the data before loading it.
- For example, you can automatically convert the incoming data to columnar formats such as Apache Parquet and Apache ORC before the data is delivered to a destination such as S3.
- Firehose can optionally invoke an AWS Lambda function to transform incoming data before delivering it to destinations. However, a Lambda function cannot be a destination; the destinations are storage and analytics services.
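
The Lambda transformation step can be sketched as below. This is a minimal example, not a definitive implementation: it assumes each incoming record is a JSON object, and the `processed` field it adds is a hypothetical transformation chosen for illustration. The record shape follows the Firehose data-transformation contract: `recordId`, base64-encoded `data`, and a `result` of `Ok`, `Dropped`, or `ProcessingFailed`.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record and return it in the shape Firehose expects."""
    output = []
    for record in event["records"]:
        try:
            # Firehose passes the payload base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            # Hypothetical transformation: mark the record as processed.
            payload["processed"] = True
            # Append a newline so records stay separable once delivered.
            encoded = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": encoded}
            )
        except (ValueError, TypeError):
            # Not a JSON object: return the original data marked as failed.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Every `recordId` received must appear in the response, which is why malformed records are returned with `ProcessingFailed` rather than skipped.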
Kinesis Firehose is used when:
- collecting streaming data and delivering to the destination quickly
- processing is optional, and data retention is not important
- e.g., capturing data from IoT devices and streaming it into a data lake
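
As a rough illustration of the IoT use case, the sketch below sends device readings to a delivery stream with the `PutRecordBatch` API. It assumes `client` is a Firehose client such as the one returned by `boto3.client("firehose")`. The function name `send_readings`, the stream name, and the reading format are hypothetical. The limit of 500 records per call reflects the documented `PutRecordBatch` quota at the time of writing.

```python
import json

# PutRecordBatch accepts at most 500 records per call (assumed current quota).
MAX_RECORDS_PER_BATCH = 500


def chunk(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_readings(client, stream_name, readings):
    """Send readings in batches; return how many records the service rejected."""
    failed = 0
    for batch in chunk(readings, MAX_RECORDS_PER_BATCH):
        # Each record is newline-delimited JSON, encoded as bytes.
        records = [
            {"Data": (json.dumps(reading) + "\n").encode("utf-8")}
            for reading in batch
        ]
        response = client.put_record_batch(
            DeliveryStreamName=stream_name, Records=records
        )
        failed += response.get("FailedPutCount", 0)
    return failed
```

A call such as `send_readings(boto3.client("firehose"), "iot-stream", readings)` would push the readings toward the configured destination. A production producer would also retry the individual records reported as failed, which this sketch only counts.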