[AWS] S3 Architecture

Amazon S3 (Simple Storage Service) is an object storage service that can be used to store and retrieve any amount of data.


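  • S3 is object-based: each object is a key-value pair plus a few attributes.
    • Key: the name of the object, unique within its bucket
    • Value: the data, a sequence of bytes
    • Version ID: identifies a specific version of the object when versioning is enabled
    • Metadata: properties describing the data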
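
The object model above can be sketched as a toy in-memory structure. This is only an illustration of how key, value, version ID, and metadata relate to each other; `StoredObject` and `ToyBucket` are hypothetical names, and nothing here reflects how S3 is implemented or how its API looks.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    key: str                  # the name of the object
    value: bytes              # the data, a sequence of bytes
    version_id: str           # distinguishes versions stored under one key
    metadata: dict = field(default_factory=dict)  # properties about the data


class ToyBucket:
    """Hypothetical in-memory stand-in for a versioned bucket."""

    def __init__(self):
        # key -> list of versions, oldest first
        self._versions = {}

    def put(self, key, value, metadata=None):
        # Every write creates a new version instead of replacing the old one.
        obj = StoredObject(key, value, uuid.uuid4().hex, dict(metadata or {}))
        self._versions.setdefault(key, []).append(obj)
        return obj

    def get(self, key, version_id=None):
        versions = self._versions[key]
        if version_id is None:
            return versions[-1]  # latest version by default
        for obj in versions:
            if obj.version_id == version_id:
                return obj
        raise KeyError(version_id)
```

Writing twice to the same key keeps both versions: `get(key)` returns the latest one, and passing an older `version_id` retrieves the earlier data.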

S3 Buckets

  • Bucket names must be globally unique.
    • S3 uses a universal namespace: no two buckets can share a name, even across different AWS accounts.
    • 3 to 63 characters in length
    • Lowercase letters, numbers, hyphens, and periods only (no uppercase or underscores); must start and end with a lowercase letter or number; can’t be formatted as an IP address
  • Bucket quota per account: historically 100 by default, with a hard limit of 1,000 via support request. Since late 2024 the default is 10,000, and more can be requested.
    • Unlimited objects and unlimited total capacity in buckets
    • Each object’s size is 0 ~ 5TB
  • Even though the bucket namespace is global, each bucket is created in a specific region that you select.
    • Objects are physically stored in that region but can be accessed from anywhere, subject to permissions
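
The naming rules above can be checked locally before trying to create a bucket. The following is a minimal sketch using only the Python standard library; `is_valid_bucket_name` is a hypothetical helper, it covers only the rules listed in these notes plus the "no consecutive periods" rule, and it cannot tell whether a name is already taken globally.

```python
import ipaddress
import re

# 3 to 63 characters: lowercase letters, digits, periods, and hyphens,
# starting and ending with a lowercase letter or digit.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Check a candidate name against a subset of the S3 naming rules."""
    if not _BUCKET_NAME_RE.fullmatch(name):
        return False
    if ".." in name:
        # Two adjacent periods are not allowed.
        return False
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True   # not an IP address, so the name is acceptable
    return False      # formatted like an IP address, e.g. 192.168.5.4
```

For example, `is_valid_bucket_name("my-bucket-2024")` passes, while `"My_Bucket"`, `"ab"`, and `"192.168.5.4"` are rejected.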

Uploading Objects to S3

  • Uploading an object: can be done using S3 Console, CLI, or SDKs
  • Single PUT upload: up to 5GB
    • may perform poorly when the file is large
    • if the upload fails partway, the whole object has to be sent again
  • Multipart upload: each part is 5MB to 5GB (the last part can be smaller), up to 10,000 parts.
    • Faster (parts upload in parallel), and a failed part can be retried on its own.
    • Recommended for objects over 100MB, required for objects larger than 5GB.
  • HTTP 200 response if the upload was successful
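
The size limits above imply a small calculation when choosing between a single PUT and a multipart upload. The sketch below works that out with plain arithmetic and makes no AWS calls. `plan_upload` and the 8 MiB preferred part size are hypothetical choices for illustration, and the limits are assumed to be binary units (MiB, GiB).

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

MIN_PART_SIZE = 5 * MIB          # minimum size of every part except the last
MAX_PART_SIZE = 5 * GIB          # maximum size of a single part
MAX_PARTS = 10_000               # maximum number of parts per upload
MULTIPART_THRESHOLD = 100 * MIB  # above this, multipart is recommended


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding float rounding on large sizes."""
    return -(-a // b)


def plan_upload(size_bytes: int, preferred_part_size: int = 8 * MIB):
    """Return (method, part_size, part_count) for an object of the given size.

    The preferred part size is an assumed default, not an S3 requirement.
    """
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes < MULTIPART_THRESHOLD:
        return ("single_put", size_bytes, 1)

    # Grow the part size if needed so the object fits in 10,000 parts.
    part_size = max(preferred_part_size, MIN_PART_SIZE,
                    ceil_div(size_bytes, MAX_PARTS))
    if part_size > MAX_PART_SIZE:
        raise ValueError("object is too large for a multipart upload")
    return ("multipart", part_size, ceil_div(size_bytes, part_size))
```

A 1 GiB object with the assumed 8 MiB part size is split into 128 parts. For much larger objects the part size grows automatically so that the part count stays within 10,000.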

Data Consistency

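  • Strong Read after Write Consistency: for PUTs of new objects, overwrite PUTs, and DELETEs
    • You can read an object right after creating it, and a read issued after an update or delete reflects that change.
  • Eventual Consistency: applied to overwrite PUTs and DELETEs only before December 2020
    • Back then, updating or deleting an existing object took some time to propagate, so a read could briefly return the old version.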

