Amazon Redshift is a petabyte-scale data warehouse service, typically used for business intelligence (BI).
- Column-based (columnar) database; data can be loaded from S3.
- Used for OLAP (Online Analytical Processing) workloads (data retrieval and analytics).
- Not for transaction-based processing (OLTP), which RDS can support.
- Not for unstructured data, which NoSQL databases can support.
- MPP (Massively Parallel Processing)
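- Single node: one node acts as both leader and compute node.
- Multi-node: a leader node (manages client connections and receives queries) plus compute nodes (store data and execute queries).
- Kinesis (for example, Kinesis Data Firehose) can load data into Redshift. Redshift is used for analyzing summarized data from other services such as Kinesis or RDS.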
- Charges: compute node hours (no charge for leader node hours), backups, and data transfer within a VPC.
- Security: SSL in transit, AES-256 encryption at rest
- Backups: snapshots are stored in S3; you can restore a snapshot into a different AZ and copy snapshots to another region.
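
Loading from S3, as noted in the list above, is typically done with Redshift's COPY command. The following is a minimal sketch that only builds the SQL text and does not connect to a cluster. The table name, bucket path, and IAM role ARN are placeholders, and `build_copy_statement` is a hypothetical helper, not part of any AWS SDK.

```python
def build_copy_statement(table: str, s3_path: str, iam_role_arn: str,
                         data_format: str = "PARQUET") -> str:
    """Return a Redshift COPY statement that loads `table` from S3.

    Nothing is validated or escaped here, so only pass trusted values.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS {data_format};"
    )


# Placeholder values for illustration only.
copy_sql = build_copy_statement(
    table="sales",
    s3_path="s3://example-bucket/sales/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(copy_sql)
```

The resulting statement would be run through whatever SQL client you already use against the cluster; the IAM role must allow Redshift to read the S3 location.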
Redshift Spectrum is a feature of Amazon Redshift.
- Spectrum queries and analyzes data directly in S3 using the open data formats you already use, without first loading the data into Redshift or running ETL transformations.
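
Although the data itself stays in S3, Redshift still needs an external schema and an external table definition that describe it. The sketch below assembles the SQL statements involved, assuming Parquet files and a Glue/Athena data catalog. The schema name, catalog database, role ARN, S3 location, and the `sales` columns are all made-up placeholders, and `spectrum_setup_statements` is a hypothetical helper.

```python
from typing import List


def spectrum_setup_statements(schema: str, catalog_db: str,
                              iam_role_arn: str, s3_location: str) -> List[str]:
    """Build the SQL needed to query S3 data through Redshift Spectrum."""
    # 1. External schema that maps to a database in the data catalog.
    create_schema = (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM DATA CATALOG DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )
    # 2. External table that describes files already sitting in S3.
    create_table = (
        f"CREATE EXTERNAL TABLE {schema}.sales (\n"
        "    sale_id INTEGER,\n"
        "    amount  DECIMAL(10, 2)\n"
        ")\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )
    # 3. Ordinary SQL query; the data is read from S3 at query time.
    query = f"SELECT COUNT(*), SUM(amount) FROM {schema}.sales;"
    return [create_schema, create_table, query]


# Placeholder values for illustration only.
statements = spectrum_setup_statements(
    schema="spectrum_schema",
    catalog_db="spectrum_db",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleSpectrumRole",
    s3_location="s3://example-bucket/sales/",
)
for statement in statements:
    print(statement, end="\n\n")
```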
Redshift Enhanced VPC Routing
With Enhanced VPC Routing enabled, Redshift forces all COPY and UNLOAD traffic between the cluster and data repositories through the Amazon VPC. If Enhanced VPC Routing is not enabled, Redshift routes traffic through the internet.
With Enhanced VPC Routing, you can use standard VPC features, such as security groups, network access control lists (ACLs), VPC endpoints, internet gateways, and Domain Name System (DNS) servers, to control how data flows between the cluster and other resources.
You can also monitor COPY and UNLOAD traffic using VPC Flow Logs.
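
Enhanced VPC Routing is a per-cluster setting. As a rough sketch, it can be switched on for an existing cluster through the Redshift `ModifyCluster` API, shown here in the form boto3 exposes it (`modify_cluster` with `EnhancedVpcRouting`). The cluster identifier is a placeholder and both helper functions are hypothetical. The client is passed in, so the block runs without boto3 or AWS credentials.

```python
def enhanced_vpc_routing_request(cluster_id: str, enabled: bool = True) -> dict:
    """Build the keyword arguments for a Redshift modify_cluster call."""
    return {"ClusterIdentifier": cluster_id, "EnhancedVpcRouting": enabled}


def set_enhanced_vpc_routing(client, cluster_id: str, enabled: bool = True):
    """Apply the setting using a boto3 Redshift client supplied by the caller."""
    return client.modify_cluster(**enhanced_vpc_routing_request(cluster_id, enabled))


# Usage with a real client (requires boto3 and AWS credentials; not run here):
#   import boto3
#   set_enhanced_vpc_routing(boto3.client("redshift"), "example-cluster")

# Placeholder cluster identifier for illustration only.
request = enhanced_vpc_routing_request("example-cluster")
print(request)
```

Once the setting is on, COPY and UNLOAD only succeed if the VPC provides a network path to the data source, for example an S3 VPC endpoint or a NAT/internet gateway.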