Amazon Simple Storage Service (S3) was first launched in 2006, in the early years of AWS, and has since grown into the object storage backbone of cloud computing. S3 grew so fast that by 2013 it was observed to contain five objects for every star in our galaxy; a current size estimate is difficult to find, presumably because of the service's mind-blowing exponential growth since then. S3 is object-based cloud storage, meaning it organizes data as objects rather than in a file hierarchy, which helps make it a powerful combination of muscle and user-friendliness. After using the service for my own web app projects and data storage needs, and after seeing it in action at large companies, I've identified five reasons to consider using S3.
Reason 1: Versatility
Amazon S3 is a surprisingly versatile service. You can use it to host static websites, store database backups and app or system log files, build a data lake, or even serve as the foundation of a scalable serverless architecture, to name a few common use cases.
Using S3 to create a static website, for example, is a low-cost, easily deployable option. This is a fun project for those new to the AWS cloud, given it is not difficult to set up. Large companies also use S3 to host websites, especially since another Amazon service, CloudFront, can be additionally set up to serve content (such as video, audio, or HTML). This relationship reduces latency and boosts transfer rates.
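As a sketch of what the website setup might look like in code (the bucket name is hypothetical, and the index/error document names are conventional but arbitrary; actually applying the configuration requires AWS credentials and the boto3 library):

```python
# Website configuration in the shape S3's PutBucketWebsite API expects.
website_config = {
    "IndexDocument": {"Suffix": "index.html"},
    "ErrorDocument": {"Key": "error.html"},
}

def enable_static_site(bucket_name: str) -> None:
    """Apply the website configuration to a bucket (requires AWS credentials)."""
    import boto3  # imported lazily so the config above is usable on its own

    s3 = boto3.client("s3")
    s3.put_bucket_website(Bucket=bucket_name, WebsiteConfiguration=website_config)

# Example (not run here): enable_static_site("my-demo-site")
```

Note that the bucket must also allow public read access (and, for a custom domain, typically match the domain name) before the site is reachable.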
Amazon databases such as RDS and Aurora automatically store their backups in S3. On-premises data, such as backups from an on-prem SQL Server database, can also be manually moved into S3, though this task can be fairly involved. There are no limits to the number of objects you can store in an Amazon S3 bucket, which makes it a powerful ally when it comes to backups.
Another use for S3 is to store objects (such as photos) uploaded by users. You can combine S3 with another AWS service, such as AWS Lambda (which lets you run code without provisioning a server) or Amazon SQS (Simple Queue Service), either of which can be configured to respond to S3 object-creation events. A wide variety of elegantly decoupled serverless applications can be architected around this sort of setup.
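A minimal sketch of what a Lambda handler for such an event might look like (the bucket and key below are hypothetical; the event shape follows S3's notification message format):

```python
def handler(event, context):
    """Minimal AWS Lambda handler for S3 ObjectCreated notifications."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # ...process the new object here: generate a thumbnail, index it, etc...
        processed.append(f"s3://{bucket}/{key}")
    return processed

# Trimmed-down example of the payload S3 delivers when an object is created:
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "user-uploads"},
                "object": {"key": "photos/cat.jpg"}}}
    ]
}
```

Because S3 invokes the function for you on each upload, there is no server to manage; the handler only needs to know how to process one event at a time.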
Reason 2: Cost
A common theme in AWS is that you pay for what is used, versus what is allocated. On-prem storage solutions may result in bills stemming from categories such as capital and operational costs, and even from indirect business costs. In contrast, costs from Amazon S3 can be boiled down to how much is stored in each bucket, along with any data retrieval or transfer fees.
Amazon S3 offers a variety of storage classes to match a range of use cases. Intelligent-Tiering, for example, is a powerful way to reduce the storage bill for objects whose access patterns are random or unknown. It analyzes access patterns and automatically moves data to a lower-cost tier when possible. Check out our 1Strategy blog post on Intelligent-Tiering.
S3 also lets you automate object lifecycles with an easy configuration option. This lets you define when objects should expire (gone are the days of sitting next to a file cabinet, sorting through the files that need to be shredded!), or when an object should transition to another storage class. For example, if logs should be archived after a certain period, a lifecycle rule can move the data to Amazon S3 Glacier, which offers three retrieval options to further reduce costs. This can all be done without impacting performance.
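A sketch of what such a lifecycle rule might look like (the prefix and day counts are illustrative; applying it requires AWS credentials and boto3):

```python
# Lifecycle rule: archive objects under logs/ to Glacier after 30 days,
# then delete them entirely after a year.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
            "Expiration": {"Days": 365},
        }
    ]
}

def apply_lifecycle(bucket_name: str) -> None:
    """Attach the lifecycle rule to a bucket (requires AWS credentials)."""
    import boto3

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket_name, LifecycleConfiguration=lifecycle_config
    )
```

Once applied, S3 evaluates the rule daily; no cron jobs or cleanup scripts are needed on your side.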
We think about reducing costs in the AWS Cloud all the time; see our blog post on cost optimization on AWS for more on this topic.
Reason 3: Secure Storage
There is a foundational level of security that Amazon S3 provides. For example, your S3 bucket is, upon initial creation, accessible only by the resource owner. If you are determined to open your bucket up for public access, you must take deliberate extra steps to reduce the bucket's security.
Bucket configuration is flexible, and it is important to take advantage of the robust security features you can enable. For example, bucket policies and Access Control Lists (ACLs) may be added to grant very selective permissions to other users and accounts. You can additionally require Multi-Factor Authentication, or restrict access to specific IP addresses. AWS Identity and Access Management (IAM) is a powerful ally to S3 and should be employed, via user policies, to control access at a fine-grained level, down to an individual bucket or even folder.
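As one illustration, here is a sketch of a bucket policy that denies any request not coming from a given IP range (the bucket name is hypothetical, and 203.0.113.0/24 is a documentation-reserved range; substitute your own):

```python
import json

BUCKET = "example-reports-bucket"  # hypothetical bucket name

# Deny all S3 actions unless the request originates from the allowed CIDR.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyRequestsOutsideOfficeIP",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",
                f"arn:aws:s3:::{BUCKET}/*",
            ],
            "Condition": {"NotIpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
        }
    ],
}

def apply_policy() -> None:
    """Attach the policy to the bucket (requires AWS credentials and boto3)."""
    import boto3

    boto3.client("s3").put_bucket_policy(
        Bucket=BUCKET, Policy=json.dumps(bucket_policy)
    )
```

An explicit Deny like this overrides any Allow elsewhere, so take care that the CIDR range is correct before applying it, or you can lock yourself out.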
Extra security might also mean enabling encryption of the objects in your bucket, which is a surprisingly straightforward configuration to activate in S3. To further protect data in transit, you can also encrypt data on the client side before uploading it to S3. For more on this, and the options available for doing it, see the AWS documentation.
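To show how straightforward the server-side option is, here is a sketch of enabling default encryption on a bucket (the choice of a KMS-managed key is an assumption; "AES256" selects S3-managed keys instead):

```python
# Default-encryption rule: every new object is encrypted server-side with
# an AWS KMS key ("aws:kms"); "AES256" (SSE-S3) is the other common option.
encryption_config = {
    "Rules": [
        {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}
    ]
}

def enable_default_encryption(bucket_name: str) -> None:
    """Turn on default encryption for a bucket (requires AWS credentials)."""
    import boto3

    boto3.client("s3").put_bucket_encryption(
        Bucket=bucket_name,
        ServerSideEncryptionConfiguration=encryption_config,
    )
```

After this one call, objects uploaded without an explicit encryption header are encrypted automatically at rest.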
Reason 4: Resiliency
Resiliency is how well an architectural solution can continue providing service even when there is a disruption, such as an infrastructure or service issue.
One of the more impressive characteristics of Amazon S3 is how well it handles failures. It does this in part by automatically storing every object in a minimum of three Availability Zones, providing a head-turning 99.999999999% durability and 99.99% availability of objects over the course of a year. Scaling is also automatic: besides letting you store an unlimited amount of data, S3 responds quickly to high request rates, allowing a fluid response to changes in demand and giving a business room to grow.
Amazon S3 offers versioning to provide additional protection against application failures, and to prevent teammates from accidentally deleting data, which increases resiliency. You can also replicate an entire bucket to a different AWS Region for backup.
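Turning on versioning is a one-call change, sketched below (applying it requires AWS credentials and boto3; note that cross-Region replication requires versioning to be enabled on both the source and destination buckets):

```python
# Versioning configuration in the shape the PutBucketVersioning API expects.
versioning_config = {"Status": "Enabled"}

def enable_versioning(bucket_name: str) -> None:
    """Enable object versioning on a bucket (requires AWS credentials)."""
    import boto3

    boto3.client("s3").put_bucket_versioning(
        Bucket=bucket_name, VersioningConfiguration=versioning_config
    )
```

With versioning on, a delete only adds a delete marker, so an "accidentally deleted" object can be recovered by removing the marker or fetching an earlier version.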
Amazon CloudWatch is another service that pairs nicely with Amazon S3. It allows you to enable metrics, whether on bucket storage or on S3 requests, to automate monitoring and alerts. Such automation is hugely helpful for the engineering team and contributes to S3’s resiliency.
Reason 5: Performance
Many of the same features already mentioned carry over to Amazon S3’s performance abilities. It is super durable, swiftly scalable (allowing it to quickly respond to an increase in request rates), and teams up nicely with other services to further refine performance.
When paired with Amazon ElastiCache, for example, you can boost transfer rates over a single HTTP connection and lower latency to single-digit milliseconds. Or, you can team S3 up with Amazon CloudFront to make content easily available to people around the world.
There are additionally a number of best practices available from AWS that can help further improve S3 performance. If you use an Amazon Elastic Compute Cloud (EC2) instance to access an S3 bucket, you can lower network latency by creating the bucket in the same Region as the EC2 instance. Consider, too, which version of the AWS SDK (the language-specific APIs for AWS services) you are using to interact with S3 from within an application; later versions automatically retry requests that fail with HTTP 503 errors. Check out the best practices guidelines and Performance Design Patterns provided by AWS to explore more ways to mold S3's performance to best serve your application.
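In the Python SDK (boto3), for instance, the retry behavior can be tuned explicitly rather than left to defaults; a sketch (the attempt count and region are illustrative choices, not recommendations):

```python
# Retry settings for the boto3 S3 client. "adaptive" mode backs off and
# retries automatically on throttling responses such as HTTP 503 Slow Down.
retry_settings = {"max_attempts": 10, "mode": "adaptive"}

def make_s3_client(region: str = "us-east-1"):
    """Build an S3 client with explicit retry behavior (needs boto3 installed)."""
    import boto3
    from botocore.config import Config

    return boto3.client("s3", region_name=region, config=Config(retries=retry_settings))
```

Keeping the client's region aligned with the bucket's Region, as described above, avoids cross-Region round trips on every request.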
Amazon S3 was one of the first AWS services I used as an engineer. It is a great way to jump into cloud computing, and AWS specifically, since it is both rich in documentation and (at least initially) has a gentle learning curve. However, don’t be fooled—as I once was—into believing that Amazon S3 is a monochromatic service. It is versatile and cost-effective cloud storage that offers security, resiliency, and performance.
Want help tackling Amazon S3 or other daunting tasks in AWS? Not sure which storage solution is the best fit for your organization, or where to start? We are here to help! Just reach out to us at email@example.com; we would love to work with you!