Amazon S3 is the main object storage service on AWS and integrates with many other AWS services. S3 also has many features you need to be familiar with to use it to its full potential.
Optimizing S3 Performance
- Amazon S3 automatically scales to high request rates.
Use Prefixes
- You can get better performance by spreading your objects across different prefixes (folders).
- Each prefix supports at least 5,500 GET/HEAD requests per second. With 4 prefixes, you can reach 22,000 GET requests per second.
- Each prefix supports at least 3,500 PUT/COPY/POST/DELETE requests per second.
- Previously, randomizing prefix names with hashed characters was recommended for performance. You no longer need to do this and can use sequential, date-based naming for your prefixes.
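The per-prefix arithmetic above can be sketched in a few lines. This is only an illustration of the numbers quoted in this section; the function names and the `logs/YYYY/MM/DD/` layout are assumptions, not part of any AWS API.

```python
# Per-prefix request rates quoted in the text above.
GET_RPS_PER_PREFIX = 5_500
WRITE_RPS_PER_PREFIX = 3_500  # PUT/COPY/POST/DELETE


def max_request_rate(prefix_count: int) -> dict:
    """Return the aggregate request rates when load is spread over prefixes."""
    if prefix_count < 1:
        raise ValueError("prefix_count must be at least 1")
    return {
        "get_rps": GET_RPS_PER_PREFIX * prefix_count,
        "write_rps": WRITE_RPS_PER_PREFIX * prefix_count,
    }


def date_prefix(year: int, month: int, day: int) -> str:
    """Build a sequential, date-based prefix (hypothetical key layout)."""
    return f"logs/{year:04d}/{month:02d}/{day:02d}/"
```

For example, `max_request_rate(4)` reproduces the 22,000 GET requests per second mentioned above.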
Check KMS Limits
- If you use SSE-KMS, you must also keep the KMS request quota in mind, because object requests trigger KMS API calls.
- Upload -> S3 needs to call the “GenerateDataKey” API
- Download -> S3 needs to call the “Decrypt” API
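A rough way to reason about this is to count one KMS call per object request and compare the total against your quota. The sketch below assumes exactly that one-to-one mapping, and the quota is passed in by the caller because it depends on the account and region; the function names are hypothetical.

```python
def kms_request_rate(uploads_per_sec: float, downloads_per_sec: float) -> dict:
    """Estimate KMS calls per second, assuming one call per object request."""
    return {
        "GenerateDataKey": uploads_per_sec,  # one per upload
        "Decrypt": downloads_per_sec,        # one per download
        "total": uploads_per_sec + downloads_per_sec,
    }


def exceeds_kms_quota(uploads_per_sec: float, downloads_per_sec: float,
                      quota_rps: float) -> bool:
    """Return True when the estimated KMS call rate is above the given quota."""
    return kms_request_rate(uploads_per_sec, downloads_per_sec)["total"] > quota_rps
```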
Multipart Uploads
- Recommended for files over 100 MB
- Required for files over 5 GB
- Parts can be uploaded in parallel, which increases throughput.
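The thresholds above can be turned into a small planning helper. This is a sketch, not an upload client: it only decides whether multipart is needed and splits a file into byte ranges that could then be uploaded in parallel. It assumes "MB" and "GB" mean MiB and GiB, and it adds the 10,000-part limit of multipart uploads, which is not mentioned in the text.

```python
from typing import List, Tuple

MIB = 1024 * 1024
MULTIPART_RECOMMENDED = 100 * MIB  # recommended above this size
SINGLE_PUT_LIMIT = 5 * 1024 * MIB  # multipart required above this size
MAX_PARTS = 10_000                 # S3 limit on parts per multipart upload


def upload_strategy(size_bytes: int) -> str:
    """Classify a file size using the thresholds from the list above."""
    if size_bytes > SINGLE_PUT_LIMIT:
        return "multipart-required"
    if size_bytes > MULTIPART_RECOMMENDED:
        return "multipart-recommended"
    return "single"


def plan_parts(size_bytes: int, part_size: int = 100 * MIB) -> List[Tuple[int, int, int]]:
    """Return (part_number, first_byte, last_byte) for each part."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    if -(-size_bytes // part_size) > MAX_PARTS:
        raise ValueError("too many parts; use a larger part_size")
    parts = []
    start = 0
    number = 1
    while start < size_bytes:
        end = min(start + part_size, size_bytes) - 1  # inclusive last byte
        parts.append((number, start, end))
        start = end + 1
        number += 1
    return parts
```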
Byte-Range Fetches
- You can download an object in smaller byte-range chunks and fetch the chunks in parallel.
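A minimal sketch of the idea: split the object into HTTP `Range` header values, fetch each range concurrently, and join the results in order. The `fetch` callable is an assumption standing in for a real ranged GET; here `fetch_from_memory` reads from an in-memory byte string so the example runs without AWS access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def range_headers(object_size: int, chunk_size: int) -> List[str]:
    """Return one HTTP Range header value per chunk (byte ranges are inclusive)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        f"bytes={start}-{min(start + chunk_size, object_size) - 1}"
        for start in range(0, object_size, chunk_size)
    ]


def download_in_chunks(fetch: Callable[[str], bytes], object_size: int,
                       chunk_size: int, workers: int = 4) -> bytes:
    """Fetch all ranges in parallel; pool.map keeps the chunks in order."""
    headers = range_headers(object_size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(fetch, headers))


# Stand-in for a real ranged GET request (hypothetical, for illustration only).
SAMPLE_OBJECT = bytes(range(10))


def fetch_from_memory(range_header: str) -> bytes:
    first, last = range_header[len("bytes="):].split("-")
    return SAMPLE_OBJECT[int(first):int(last) + 1]
```

Calling `download_in_chunks(fetch_from_memory, len(SAMPLE_OBJECT), 4)` reassembles the original bytes from three ranges.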
S3 Object Lock
- S3 Object Lock enables you to store objects using a write-once-read-many (WORM) model.
- It is used to meet regulatory requirements for WORM storage or to add an extra layer of protection against changes and deletion.
- Object locks can be applied on individual objects or across the bucket as a whole.
- Once an object is locked, the object cannot be modified or deleted for a fixed amount of time or indefinitely.
- Retention Period
- S3 Object lock protects an object for a specified amount of time.
- S3 stores a timestamp in the object’s metadata. After the retention period expires, an object can be overwritten or deleted.
- Legal Holds
- When you place a legal hold on an S3 object, you cannot modify or delete the object as long as the legal hold remains in effect.
- You can freely attach or remove a legal hold on an S3 object.
- The “s3:PutObjectLegalHold” permission is required.
- Lock Modes
- Governance Mode: Users need special permissions to change the retention settings or to overwrite or delete a locked object.
- Compliance Mode: A locked object cannot be overwritten or deleted by any user (even the root user). In this mode, the retention mode cannot be changed and the retention period cannot be shortened.
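The interaction between retention periods, legal holds, and lock modes can be modelled as a small decision function. This is a simplified sketch of the rules listed above, not the S3 implementation; the `LockState` class and `can_delete_version` function are hypothetical, and the "special permission" for governance mode is represented by a single boolean (in S3 this corresponds to `s3:BypassGovernanceRetention`).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class LockState:
    mode: Optional[str] = None               # "GOVERNANCE", "COMPLIANCE", or None
    retain_until: Optional[datetime] = None  # end of the retention period
    legal_hold: bool = False


def can_delete_version(lock: LockState, now: datetime,
                       has_bypass_permission: bool = False) -> bool:
    """Return True if the locked object may be deleted or overwritten."""
    if lock.legal_hold:
        return False  # a legal hold blocks deletion until it is removed
    if lock.mode is None or lock.retain_until is None:
        return True   # no retention configured
    if now >= lock.retain_until:
        return True   # retention period has expired
    if lock.mode == "GOVERNANCE":
        return has_bypass_permission  # special permission required
    return False      # COMPLIANCE: nobody can delete, not even root
```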
Glacier Vault Lock
- S3 Glacier Vault Lock is used to enforce compliance controls on S3 Glacier vaults with a vault lock policy. It is the locking mechanism for Glacier.
- Once locked, the policy can no longer be changed.
S3 Select and Glacier Select
- S3 Select is used to retrieve a subset of data (selected rows and columns) from an object using simple SQL expressions.
- It can improve the performance of your applications (by up to 400%).
- Because only the selected data is returned, you can save money on data transfer.
- Similarly, Glacier Select allows you to run simple SQL expressions against data in Glacier storage.
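To make the "rows and columns" idea concrete, the sketch below builds the request parameters for an S3 Select query over a CSV object with a header row. It only constructs a dictionary; the intent is that it could be passed to boto3's `select_object_content`, but the call itself is left out so the example runs without AWS access. The helper name, bucket, key, and column names are assumptions.

```python
from typing import List


def build_select_request(bucket: str, key: str, columns: List[str],
                         where: str = "") -> dict:
    """Build parameters for an S3 Select query on a CSV object with headers."""
    column_list = ", ".join(f"s.{name}" for name in columns) if columns else "*"
    expression = f"SELECT {column_list} FROM S3Object s"
    if where:
        expression += f" WHERE {where}"  # row filter, passed through as-is
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Use the first CSV line as column names so they can be referenced.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }
```

For example, `build_select_request("my-bucket", "users.csv", ["name", "city"], "s.city = 'Seattle'")` selects two columns and only the matching rows.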
S3 Presigned URLs
A presigned URL grants access to an object using the creator’s access permissions and can be used to download or upload objects (GET and PUT operations).
- It lets users without AWS credentials access an object.
- The URL is temporary and expires (7 days maximum).
- You may get an error when you use a presigned URL if:
- the URL has expired (7 days maximum)
- the creator’s permissions have changed
- the role’s temporary credentials have expired (if the URL was created using a role; 36 hours maximum)
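The failure conditions above can be expressed as a simple check. This sketch does not sign or generate URLs; it only models when a presigned URL would still work, under the rules listed in this section. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_EXPIRES_IN = 7 * 24 * 3600  # 7-day maximum, in seconds


def presigned_url_usable(created_at: datetime, expires_in: int, now: datetime,
                         credentials_expire_at: Optional[datetime] = None,
                         creator_still_authorized: bool = True) -> bool:
    """Return True if a presigned URL should still be accepted at `now`."""
    if not 1 <= expires_in <= MAX_EXPIRES_IN:
        raise ValueError("expires_in must be between 1 second and 7 days")
    if now >= created_at + timedelta(seconds=expires_in):
        return False  # the URL itself has expired
    if credentials_expire_at is not None and now >= credentials_expire_at:
        return False  # the role's temporary credentials have expired
    if not creator_still_authorized:
        return False  # the creator's permissions were changed
    return True
```

Note that a URL signed with short-lived role credentials stops working when those credentials expire, even if `expires_in` is longer.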
S3 Replication
You can replicate objects from one bucket to another regardless of regions.
- Previously, the feature was called S3 Cross-Region Replication (S3 CRR) and allowed one-way replication to another region.
- Now, replication is allowed between any two buckets, in the same region or in different regions.
Requirements
- Versioning must be enabled on both the source bucket and the destination bucket.
- An IAM role is required to replicate objects.
Features
- Replicated objects keep their storage class, object name (key), owner, and object permissions by default. You can override the storage class, owner, and permissions.
- Replication is one-way only.
- Replication applies only to objects added after the configuration is in place (it is not retroactive).
- All objects updated afterwards are replicated automatically.
- What’s replicated
- Unencrypted objects and objects encrypted with SSE-S3 or SSE-KMS can be replicated.
- What’s NOT replicated
- SSE-C objects
- Delete markers (by default)
- System actions (Lifecycle events)
- Any objects that existed before replication was enabled
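The replication rules listed above can be summarised as two small predicates. This is a simplified model of the default behaviour described in this section, not the actual S3 logic, and a real replication configuration can change some of these defaults. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectVersion:
    key: str
    encryption: Optional[str] = None      # None, "SSE-S3", "SSE-KMS", or "SSE-C"
    created_after_rule: bool = True       # added after replication was enabled
    is_delete_marker: bool = False
    created_by_lifecycle: bool = False    # result of a lifecycle (system) action


def can_configure_replication(source_versioned: bool, destination_versioned: bool,
                              has_iam_role: bool) -> bool:
    """Check the requirements: versioning on both buckets plus an IAM role."""
    return source_versioned and destination_versioned and has_iam_role


def is_replicated(version: ObjectVersion) -> bool:
    """Apply the "what's replicated / not replicated" lists from the text."""
    if not version.created_after_rule:
        return False  # replication is not retroactive
    if version.is_delete_marker or version.created_by_lifecycle:
        return False
    return version.encryption in (None, "SSE-S3", "SSE-KMS")
```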
S3 Transfer Acceleration (S3TA)
S3 Transfer Acceleration provides fast and secure data transfers to S3 buckets.
- It transfers data faster over long distances.
- It takes advantage of CloudFront’s globally distributed edge locations and the AWS backbone network for faster transfers.
- Rather than uploading files directly to the S3 bucket, you upload them to the closest edge location, and they are then transferred to S3 over the AWS network.
- The feature is enabled per bucket.
- A distinct URL is used to upload files.
- e.g., <bucketname>.s3-accelerate.amazonaws.com
- An additional charge applies per GB transferred.
- S3TA vs. CloudFront PUT/POST
- Use S3TA when higher throughput is required and you want to use all bucket-level features, such as multipart uploads.
- Consider CloudFront PUT/POST if the data set is less than 1 GB in size.
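The endpoint format and the S3TA vs. CloudFront rule of thumb can be sketched as follows. The helper names are hypothetical, the size rule is only the heuristic from the list above (with 1 GB read as 1 GiB), and the check for periods reflects the restriction that bucket names containing periods cannot use Transfer Acceleration.

```python
GIB = 1024 ** 3


def accelerate_endpoint(bucket: str) -> str:
    """Return the Transfer Acceleration hostname for a bucket."""
    if "." in bucket:
        # Bucket names with periods are not supported by Transfer Acceleration.
        raise ValueError("bucket name must not contain periods")
    return f"{bucket}.s3-accelerate.amazonaws.com"


def suggest_upload_path(size_bytes: int, needs_bucket_features: bool = False) -> str:
    """Pick "cloudfront" for small data sets, otherwise "s3ta"."""
    if size_bytes < GIB and not needs_bucket_features:
        return "cloudfront"
    return "s3ta"
```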
S3 Event Notification
- The S3 event notification feature lets you receive notifications when certain events (Create, Delete, Restore) happen in your bucket.
- S3 supports the following destinations: SNS topic, SQS queue, Lambda function
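When a notification is delivered to a Lambda function, the event carries one or more records naming the bucket, the object key, and the event type. The sketch below extracts those fields; the function name is hypothetical. Object keys arrive URL-encoded in the event, so they are decoded before use.

```python
from typing import Dict, List
from urllib.parse import unquote_plus


def summarize_s3_event(event: dict) -> List[Dict[str, str]]:
    """Extract the event name, bucket, and object key from each record."""
    summaries = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        summaries.append({
            "event": record["eventName"],  # e.g. "ObjectCreated:Put"
            "bucket": s3_info["bucket"]["name"],
            # Keys are URL-encoded in the notification ("+" means a space).
            "key": unquote_plus(s3_info["object"]["key"]),
        })
    return summaries
```

A Lambda handler could call `summarize_s3_event(event)` and then process each object in turn.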
Cross-origin resource sharing (CORS)
- CORS defines a way for client web applications that are loaded in one domain to interact with resources in a different domain.
- You can write a bucket’s CORS configuration in JSON.
Example CORS Configuration
[
    {
        "AllowedHeaders": [
            "*"
        ],
        "AllowedMethods": [
            "PUT",
            "POST",
            "DELETE"
        ],
        "AllowedOrigins": [
            "http://www.example1.com"
        ],
        "ExposeHeaders": []
    },
    {
        "AllowedHeaders": [],
        "AllowedMethods": [
            "GET"
        ],
        "AllowedOrigins": [
            "*"
        ],
        "ExposeHeaders": []
    }
]
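To see how the two rules in this configuration behave, the sketch below checks an origin and HTTP method against the rules in order and returns the first one that matches. It is a simplified model: it only handles exact origins and the `*` wildcard, and it ignores header matching. The function name is hypothetical.

```python
from typing import List, Optional

# The example configuration from above, as Python data.
CORS_RULES = [
    {
        "AllowedHeaders": ["*"],
        "AllowedMethods": ["PUT", "POST", "DELETE"],
        "AllowedOrigins": ["http://www.example1.com"],
        "ExposeHeaders": [],
    },
    {
        "AllowedHeaders": [],
        "AllowedMethods": ["GET"],
        "AllowedOrigins": ["*"],
        "ExposeHeaders": [],
    },
]


def find_matching_rule(rules: List[dict], origin: str, method: str) -> Optional[dict]:
    """Return the first rule that allows this origin and method, or None."""
    for rule in rules:
        origins = rule["AllowedOrigins"]
        origin_ok = "*" in origins or origin in origins
        if origin_ok and method in rule["AllowedMethods"]:
            return rule
    return None
```

With this configuration, any origin may GET objects, but only `http://www.example1.com` may PUT, POST, or DELETE.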