AWS S3 (Simple Storage Service) is a global object storage platform that can be used to store and retrieve any amount of data.
S3
- S3 is object-based: each object is stored as a key-value pair.
  - Key: the name of the object
  - Value: the data, a sequence of bytes
  - Version ID: identifies a specific version of the object when versioning is enabled
  - Metadata: properties that describe the data
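The four parts of an object listed above can be pictured with a small in-memory model. This is only a sketch for intuition: `StoredObject` and the `bucket` dictionary are hypothetical names, not AWS types or SDK calls.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StoredObject:
    """Toy stand-in for one S3 object (hypothetical, not an AWS type)."""
    key: str                          # the name of the object
    value: bytes                      # the data itself, a sequence of bytes
    version_id: Optional[str] = None  # only meaningful when versioning is in use
    metadata: Dict[str, str] = field(default_factory=dict)  # properties about the data


# A bucket behaves like a flat mapping from key to object.
bucket: Dict[str, StoredObject] = {}

obj = StoredObject(
    key="reports/2024/summary.txt",
    value=b"hello",
    metadata={"Content-Type": "text/plain"},
)
bucket[obj.key] = obj

# Look the object up by its key and report the size of its value in bytes.
print(len(bucket["reports/2024/summary.txt"].value))
```

The slashes in the key are part of the name; the model has no real folders, only keys.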
S3 Buckets
- S3 is a universal namespace: every bucket name must be globally unique across all AWS accounts.
- Names must be 3 to 63 characters long
- No uppercase letters or underscores; must start with a lowercase letter or a number; can’t be formatted as an IP address (for example, 192.168.5.4)
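- Soft limit of 100 buckets per account by default, with a hard limit of 1,000 after a support request. These are the historical values; AWS has since raised the bucket quotas, so check Service Quotas for the current numbers.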
- Unlimited number of objects and unlimited total capacity per bucket
- Each object can be from 0 bytes to 5 TB in size
- Even though S3 is a global service, you need to select a specific region for a bucket.
- Objects are physically stored in the specified region but can be accessed from anywhere
- Object URL format: https://<bucket-name>.s3.amazonaws.com/<object-key>
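The naming rules and URL format above can be sketched as a small checker. It covers only the rules listed in these notes, not the full AWS rule set. The allowed character set (lowercase letters, digits, hyphens, dots) and the virtual-hosted-style URL `https://bucket.s3.amazonaws.com/key` are assumptions, and both function names are hypothetical.

```python
import re
from urllib.parse import quote

# Assumed character set: lowercase letters, digits, hyphens and dots,
# starting with a lowercase letter or a digit.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]*")
# Four dot-separated groups of digits, i.e. something shaped like an IP address.
_IP_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a name against the rules listed in these notes only."""
    if not 3 <= len(name) <= 63:
        return False
    if _IP_RE.fullmatch(name):
        return False
    return _NAME_RE.fullmatch(name) is not None


def object_url(bucket: str, key: str) -> str:
    """Build an object URL; the key is percent-encoded, slashes are kept."""
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


print(is_valid_bucket_name("my-bucket-01"))
print(is_valid_bucket_name("My_Bucket"))
print(object_url("my-bucket-01", "docs/a b.txt"))
```

A check like this only catches malformed names early. Whether a name is already taken by another account can only be answered by S3 itself.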
Uploading Objects to S3
- Objects can be uploaded using the S3 Console, the CLI, or the SDKs
- Single PUT upload: up to 5 GB
  - May perform poorly when the file is large
  - If the request fails partway through, the whole upload fails and must be restarted
- Multipart upload: each part is 5 MB to 5 GB (the last part may be smaller), up to 10,000 parts
  - Faster (parts upload in parallel), and an individual part can fail and be retried on its own
  - Recommended for objects over 100 MB, required for objects larger than 5 GB
- A successful upload returns an HTTP 200 response
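The size limits above can be turned into a small planning helper that picks between a single PUT and a multipart upload and chooses a part size. This is a sketch of the arithmetic only; it makes no AWS calls. `plan_upload` and the 8 MB preferred part size are hypothetical, and the units are assumed to be binary (1 MB = 1024² bytes).

```python
MB = 1024 ** 2
GB = 1024 ** 3

MIN_PART = 5 * MB                # minimum part size (the last part may be smaller)
MAX_PART = 5 * GB                # maximum part size
MAX_PARTS = 10_000               # maximum number of parts per upload
MAX_OBJECT = 5 * 1024 * GB       # maximum object size (5 TB)
MULTIPART_THRESHOLD = 100 * MB   # multipart is recommended above this size


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up, without going through floats."""
    return -(-a // b)


def plan_upload(size: int, preferred_part: int = 8 * MB):
    """Return ("single", 1, size) or ("multipart", part_count, part_size)."""
    if not 0 <= size <= MAX_OBJECT:
        raise ValueError("object size must be between 0 bytes and 5 TB")
    if size <= MULTIPART_THRESHOLD:
        return ("single", 1, size)
    # Grow the part size when the preferred one would need more than MAX_PARTS parts.
    part_size = max(preferred_part, MIN_PART, ceil_div(size, MAX_PARTS))
    part_size = min(part_size, MAX_PART)
    return ("multipart", ceil_div(size, part_size), part_size)


print(plan_upload(50 * MB))
print(plan_upload(1 * GB))
```

With these assumptions, a 1 GB object is split into 128 parts of 8 MB each, while an object near the 5 TB limit needs parts of roughly 524 MB to stay within 10,000 parts.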
Data Consistency
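- Read-after-write consistency: for PUTs of new objects
  - You can read an object immediately after creating it
- Eventual consistency (historical): for overwrite PUTs and DELETEs
  - Before December 2020, an update or delete of an existing object took some time to propagate, so a read could return the old version
  - Since December 2020, S3 provides strong read-after-write consistency for all PUTs and DELETEs, so this delay no longer applies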