Amazon Kinesis Firehose is a managed service that loads real-time streaming data into Amazon Simple Storage Service (S3), Amazon Redshift, or Amazon Elasticsearch Service (ES). Firehose is part of the Amazon Kinesis streaming data platform, along with Amazon Kinesis Streams and Amazon Kinesis Analytics. With Firehose, there is no need to write applications or manage resources. Applications are configured to send data to Firehose, which automatically loads the data into the specified destination.
It enables near real-time analytics with existing business intelligence tools and dashboards. It automatically scales to match the throughput of your data and requires no ongoing administration. It can also compress and encrypt the data before loading, which minimizes the amount of storage used at the destination and increases security.
A Firehose delivery stream can be created and configured from the AWS Management Console, and you can start sending data to the stream within a few minutes.
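As a rough illustration of how an application might send data to a delivery stream, here is a minimal Python sketch. It assumes the AWS SDK for Python (boto3), whose Firehose client provides `put_record`. The helper name `send_event`, the stream name `example-stream`, and the JSON-lines encoding are illustrative assumptions, not something prescribed by Firehose.

```python
import json


def send_event(client, stream_name, event):
    """Serialize one event as a JSON line and submit it as a single record.

    `client` is expected to be a boto3 Firehose client (or anything that
    exposes a compatible put_record method).
    """
    # A trailing newline keeps records separable once they are concatenated
    # at the destination (an assumed convention, not a Firehose requirement).
    data = (json.dumps(event) + "\n").encode("utf-8")
    return client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": data},
    )


# Typical use (requires boto3 and AWS credentials, so it is left as a comment):
#   import boto3
#   firehose = boto3.client("firehose")
#   send_event(firehose, "example-stream", {"path": "/index.html", "status": 200})
```

Passing the client in as an argument keeps the helper easy to exercise with a stand-in object before pointing it at a real delivery stream.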
With Amazon Kinesis Firehose, you only pay for the amount of data you transmit through the service with no minimum fee or setup cost required.
Key Concepts and Terminology:
The following terms help in understanding and using Amazon Kinesis Firehose.
- Firehose delivery stream
Users create a Firehose delivery stream and then send data to it.
- Record
A record is the data of interest submitted by the user to the delivery stream. It can be up to 1000 KB in size.
- Data producers
Data producers are the applications that generate streaming data. Examples include a web application that generates log data and sends it to a delivery stream, or a web crawler that sends crawled data.
- Buffer Size and Buffer Interval
Firehose buffers incoming streaming data to a certain size or for a certain period of time before delivering it to destinations, whichever condition is met first. Buffer Size is specified in MB and Buffer Interval in seconds.
- Amazon Kinesis Agent
The Amazon Kinesis Agent is a Java application for Linux-based servers that monitors files, such as log files, and continuously collects and sends data to your delivery stream.
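To make the Record and Buffer Size / Buffer Interval terms above more concrete, here is a small Python sketch of size-or-time buffering. It is a toy model written for illustration only, not how Firehose is implemented. The class name `DeliveryBuffer`, the `deliver` callback, and the injectable `clock` are all hypothetical, and the 1000 KB limit is interpreted here as 1000 * 1024 bytes.

```python
import time

# Record size limit from the definition above, assumed to mean 1000 * 1024 bytes.
MAX_RECORD_BYTES = 1000 * 1024


class DeliveryBuffer:
    """Toy model of size- or time-based buffering (illustrative only)."""

    def __init__(self, size_mb, interval_seconds, deliver, clock=time.monotonic):
        self.size_limit = size_mb * 1024 * 1024  # Buffer Size, converted to bytes
        self.interval = interval_seconds         # Buffer Interval, in seconds
        self.deliver = deliver                   # callback that receives a list of records
        self.clock = clock                       # injectable so the sketch is testable
        self.records = []
        self.buffered_bytes = 0
        self.started_at = None                   # when the first buffered record arrived

    def put(self, record):
        """Buffer one record (bytes) and deliver if a threshold is reached."""
        if len(record) > MAX_RECORD_BYTES:
            raise ValueError("record exceeds 1000 KB")
        if self.started_at is None:
            self.started_at = self.clock()
        self.records.append(record)
        self.buffered_bytes += len(record)
        self.flush_if_due()

    def flush_if_due(self):
        """Deliver the buffered records if either threshold has been met."""
        if not self.records:
            return False
        size_hit = self.buffered_bytes >= self.size_limit
        time_hit = self.clock() - self.started_at >= self.interval
        if size_hit or time_hit:
            self.deliver(self.records)
            self.records = []
            self.buffered_bytes = 0
            self.started_at = None
            return True
        return False
```

In this sketch, a buffer created with `DeliveryBuffer(1, 60, print)` would hand its records to `print` once 1 MB has accumulated or 60 seconds have passed since the first buffered record, whichever comes first. A real consumer would also call `flush_if_due` periodically so that the interval is honored when no new records arrive.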
Example usage of the configurations for Amazon S3, Redshift and Elasticsearch, along with the argument reference, is available here.