Amazon S3 Storage Buckets | What Are They And How They Work
What is an Amazon S3 Storage Bucket?
Most people who are just starting to learn about the cloud soon hear about a service known as Amazon Simple Storage Service, or Amazon S3 for short. One of the first questions that usually follows is: what is an Amazon S3 storage bucket?
An Amazon S3 storage bucket is the top-level storage container of the Amazon Simple Storage Service. It is used to keep a group of project-related objects, or files, together in a single location. You can think of it as similar to a hard drive, except with virtually unlimited storage capacity.
Even though an Amazon S3 storage bucket is similar to a hard drive, there are many differences between them. There are also many good use cases for choosing an Amazon S3 storage bucket over standard storage systems.
Amazon S3 Storage Bucket Options
There are several different settings that can be enabled on an Amazon S3 storage bucket. These settings dictate how the storage bucket will operate and can determine things like storage pricing, time to fetch the objects, and the availability of the stored objects.
One such feature is object versioning. When versioning is enabled on an Amazon S3 storage bucket, multiple versions of each object are saved and can be restored if needed. This helps protect against accidental deletion of objects stored within the bucket, and makes it possible to roll back an unintended edit of an S3 object.
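As a rough sketch of how versioning might be switched on programmatically, the Python below builds the arguments that the boto3 S3 client's put_bucket_versioning call accepts. The bucket name and the helper function are hypothetical, and the actual boto3 call is left as a comment so the snippet runs without AWS credentials.

```python
def versioning_request(bucket_name, enabled=True):
    """Build keyword arguments for boto3's S3 put_bucket_versioning call."""
    return {
        "Bucket": bucket_name,
        # S3 accepts "Enabled" or "Suspended" as the versioning status.
        "VersioningConfiguration": {
            "Status": "Enabled" if enabled else "Suspended"
        },
    }


request = versioning_request("example-project-bucket")  # hypothetical bucket name
print(request)

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   boto3.client("s3").put_bucket_versioning(**request)
```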
Another option available to Amazon S3 storage buckets is server access logging. This setting records detailed information about each request made to the S3 storage bucket in question. This is very useful if any type of security audit needs to be performed on the storage bucket, as it gives an audit trail of when requests were made and who made them.
Another feature similar to server access logging is object-level logging. Enabling this links the AWS CloudTrail service to the Amazon S3 storage bucket, and it can be configured to track any read or write event on any object within the storage bucket. With this enabled, requests to an object are tracked and can later be investigated or analyzed if needed.
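For illustration, server access logging could be configured along the following lines. This is a minimal sketch that only builds the arguments for the boto3 S3 client's put_bucket_logging call. Both bucket names, the log prefix, and the helper function are assumptions made up for the example.

```python
def logging_request(source_bucket, log_bucket, log_prefix="access-logs/"):
    """Build keyword arguments for boto3's S3 put_bucket_logging call."""
    return {
        "Bucket": source_bucket,
        "BucketLoggingStatus": {
            "LoggingEnabled": {
                # Bucket that receives the log files (hypothetical name below).
                "TargetBucket": log_bucket,
                # Key prefix under which the log objects are written.
                "TargetPrefix": log_prefix,
            }
        },
    }


log_request = logging_request("example-project-bucket", "example-log-bucket")
print(log_request)

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_bucket_logging(**log_request)
```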
One really useful feature of Amazon S3 storage buckets is the static website hosting option. This allows the owner of the bucket to place the files of a static website inside the storage bucket in order to serve that content to the world wide web. When setting this up, you only need to identify the index document as well as the error document that should be served when an error is triggered. Redirection rules can also be configured here to redirect certain object key paths to other paths within the bucket. Another option here is to have the bucket redirect every request it receives to another storage bucket or domain. This is useful when you want to redirect a root domain, such as example.com, to its www-prefixed counterpart.
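The two hosting modes described above might be expressed as follows. This sketch builds the WebsiteConfiguration argument for the boto3 S3 client's put_bucket_website call. The document names, host name, bucket name, and helper functions are all assumptions chosen for the example.

```python
def hosting_configuration(index_suffix="index.html", error_key="error.html"):
    """Website configuration that serves content from the bucket."""
    return {
        "IndexDocument": {"Suffix": index_suffix},
        "ErrorDocument": {"Key": error_key},
    }


def redirect_configuration(host_name, protocol="https"):
    """Website configuration that redirects every request to another host."""
    return {
        "RedirectAllRequestsTo": {"HostName": host_name, "Protocol": protocol}
    }


site = hosting_configuration()
redirect = redirect_configuration("www.example.com")  # hypothetical host name
print(site)
print(redirect)

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_bucket_website(
#       Bucket="example.com", WebsiteConfiguration=redirect)
```

A bucket uses one mode or the other: a configuration that redirects all requests does not also name index and error documents.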
One other great feature of an Amazon S3 storage bucket is the ability to enable server-side encryption of the objects stored within the storage bucket. When the AES-256 encryption method is enabled, the Amazon Simple Storage Service will use its own managed keys to encrypt the object data with the AES-256 encryption algorithm. If the AWS-KMS option is used instead, the keys used for encrypting the objects will be managed by the AWS Key Management Service.
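A default encryption rule for either option could be sketched as below. The code builds the ServerSideEncryptionConfiguration argument for the boto3 S3 client's put_bucket_encryption call. The helper function, key alias, and bucket name are hypothetical.

```python
def encryption_configuration(kms_key_id=None):
    """Default-encryption rule: S3-managed keys, or KMS if a key is given."""
    if kms_key_id is None:
        # Keys managed by S3 itself, using AES-256.
        default = {"SSEAlgorithm": "AES256"}
    else:
        # Keys managed by the AWS Key Management Service.
        default = {"SSEAlgorithm": "aws:kms", "KMSMasterKeyID": kms_key_id}
    return {"Rules": [{"ApplyServerSideEncryptionByDefault": default}]}


s3_managed = encryption_configuration()
kms_managed = encryption_configuration("alias/example-key")  # hypothetical key alias
print(s3_managed)
print(kms_managed)

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_bucket_encryption(
#       Bucket="example-project-bucket",
#       ServerSideEncryptionConfiguration=kms_managed)
```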
Amazon S3 Storage Bucket Is An Object Store
One major difference between an Amazon S3 storage bucket and a typical storage drive is that the bucket is an object store rather than the hierarchical filesystem found on most hard drives or USB drives. An object store places every object in a flat namespace, while a typical filesystem stores files in a hierarchical directory structure. Even though the objects in an Amazon S3 storage bucket may appear to be in a directory-like structure in the user interface, they are not actually stored that way in the underlying storage system.
You can think of an object store as a key-value store, where an object's name is its key and the object's data is its value. The path-like naming convention used by most object key names is what makes the object store seem like a typical hierarchical file system. Storing the objects in this key-value fashion is what helps make looking up any object in an Amazon S3 storage bucket extremely fast, even when there are potentially billions of objects in the bucket. Using prefix-based searching on the object keys, the actual location of the object in the storage bucket can be determined efficiently. Once the location is known, fetching the data is a relatively simple task.
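The flat namespace idea can be illustrated with a toy model in Python. This is only a sketch: a plain dictionary stands in for the bucket, the keys are invented, and the linear scan says nothing about how S3 indexes keys internally. It shows that "folders" are simply shared key prefixes.

```python
# A flat mapping from object key to object data; there are no directories.
bucket = {
    "photos/2023/beach.jpg": b"...",
    "photos/2023/city.jpg": b"...",
    "photos/2024/snow.jpg": b"...",
    "readme.txt": b"...",
}


def list_keys(store, prefix=""):
    """Return every key that starts with the given prefix."""
    return sorted(key for key in store if key.startswith(prefix))


def list_folders(store, prefix="", delimiter="/"):
    """Group keys under a prefix into folder-like common prefixes."""
    folders = set()
    for key in store:
        if key.startswith(prefix):
            rest = key[len(prefix):]
            if delimiter in rest:
                folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
    return sorted(folders)


print(list_keys(bucket, "photos/2023/"))
# ['photos/2023/beach.jpg', 'photos/2023/city.jpg']
print(list_folders(bucket, "photos/"))
# ['photos/2023/', 'photos/2024/']
```

The grouping in list_folders is similar in spirit to what a console does when it presents keys as a directory tree.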
Even though the storage bucket is an object store, any type of file can be uploaded into the storage bucket given the proper tools. The Amazon S3 storage bucket can hold any number of objects that you are able to upload to it. The only real restriction here is that the maximum object size is currently 5 terabytes.
Amazon S3 Storage Bucket Storage Classes
There are many different storage classes available to Amazon S3 storage buckets. We will give a brief overview of the ones currently available. They are designed to help Amazon customers with different storage use cases that they may have.
The original storage class is known as S3 Standard. This is the most general storage class and should be the default one that most users use, unless they have a specific reason to use one of the other classes.
This storage class, like all the other storage classes, is designed for 11 nines of durability. This means that, on average, 99.999999999% of the objects stored within the bucket are expected to survive a one-year period without being corrupted or lost. Given one million stored objects in the storage bucket, with this durability you would expect to lose, or have corrupted, an average of one object every 100,000 years!
This storage class, as well as the S3 Glacier and S3 Glacier Deep Archive storage classes, is designed so that objects in the storage bucket are available 99.99% of the time during a one-year period. The S3 Standard Infrequent Access storage class is designed for 99.9% availability, while the S3 One Zone Infrequent Access storage class is designed for 99.5% availability.
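The arithmetic behind that figure can be checked with a few lines of Python. This is a back-of-the-envelope sketch that assumes each object independently has a 1 in 100,000,000,000 chance of being lost in a given year, which is a simplification of what the durability figure expresses.

```python
from fractions import Fraction

# Eleven nines of durability leaves a 1e-11 chance of loss per object per year.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def years_per_expected_loss(object_count):
    """Average number of years between single-object losses."""
    expected_losses_per_year = object_count * ANNUAL_LOSS_PROBABILITY
    return 1 / expected_losses_per_year


print(years_per_expected_loss(1_000_000))  # 100000
```

Exact fractions are used instead of floats so that the tiny probability is not distorted by rounding.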
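To put these percentages in perspective, the sketch below converts each availability figure into the number of hours per year for which objects could be unreachable. It assumes a 365-day year and treats the figures as design targets rather than guarantees. The function name is made up for the example.

```python
from decimal import Decimal

HOURS_PER_YEAR = Decimal(365 * 24)

# Availability design targets quoted in the text, in percent.
DESIGNED_AVAILABILITY = {
    "S3 Standard": Decimal("99.99"),
    "S3 Standard-IA": Decimal("99.9"),
    "S3 One Zone-IA": Decimal("99.5"),
}


def unavailable_hours_per_year(availability_percent):
    """Hours in a year not covered by the given availability percentage."""
    unavailable_fraction = (Decimal(100) - availability_percent) / Decimal(100)
    return unavailable_fraction * HOURS_PER_YEAR


for name, percent in DESIGNED_AVAILABILITY.items():
    hours = float(unavailable_hours_per_year(percent))
    print(f"{name}: about {hours:.2f} hours per year")
```

Under these assumptions the three classes work out to a little under one hour, just under nine hours, and just under 44 hours per year respectively.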
The S3 Intelligent-Tiering storage class is designed by Amazon to automatically determine which access tier is best suited for each object within the storage bucket. This helps optimize storage cost by monitoring how often each object is accessed and moving it to the most appropriate tier. The main difference between the tiers is that one tier is for more frequent access while the other tier is for less frequent access.
The S3 Standard Infrequent Access storage class is used when the objects in the storage bucket are accessed less frequently but, when accessed, still need to be retrieved quickly. The S3 One Zone Infrequent Access storage class is similar, except that the data is stored in a single availability zone instead of across multiple availability zones as it is with the S3 Standard-IA storage class. If that single availability zone goes down, these objects will not be available until the availability zone is restored. If the availability zone is destroyed, the objects stored in S3 One Zone-IA will not be recoverable.
The remaining storage classes are known as the Amazon S3 Glacier and Amazon S3 Glacier Deep Archive storage classes. These storage classes are mainly used for data archiving, and are used only when quick retrieval of the objects stored in the storage bucket is not required. These storage classes are among the cheapest available.
Retrieval times for Amazon S3 Glacier vary between minutes and hours depending on the retrieval option chosen. Amazon S3 Glacier Deep Archive has longer retrieval times still, and is designed for objects that would only potentially be accessed a handful of times in a year. The objects stored in Amazon S3 Glacier Deep Archive are typically ones that need to be retained for 7 to 10 years.
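A storage class is chosen per object, typically at upload time. As an illustrative sketch, the Python below maps the class names used in this article to the StorageClass identifiers accepted by the boto3 S3 client's put_object call, and builds the arguments for an upload. The bucket name, object key, and helper function are hypothetical.

```python
# Article names for the storage classes mapped to S3 API identifiers.
STORAGE_CLASS_IDS = {
    "S3 Standard": "STANDARD",
    "S3 Intelligent-Tiering": "INTELLIGENT_TIERING",
    "S3 Standard-IA": "STANDARD_IA",
    "S3 One Zone-IA": "ONEZONE_IA",
    "S3 Glacier": "GLACIER",
    "S3 Glacier Deep Archive": "DEEP_ARCHIVE",
}


def upload_request(bucket, key, body, storage_class="S3 Standard"):
    """Build keyword arguments for boto3's S3 put_object call."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "StorageClass": STORAGE_CLASS_IDS[storage_class],
    }


archive_upload = upload_request(
    "example-project-bucket",        # hypothetical bucket name
    "archive/2015/records.csv",      # hypothetical object key
    b"id,amount\n1,100\n",
    storage_class="S3 Glacier Deep Archive",
)
print(archive_upload["StorageClass"])  # DEEP_ARCHIVE

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_object(**archive_upload)
```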