How do you choose the best storage option on AWS?
Choosing a storage service is critical when designing a cloud architecture. Read on to learn about the characteristics, limitations, typical use cases, and a decision tree for the following options to store data on AWS:
- Instance Store provides low latency and high throughput block storage for EC2 instances.
- EBS (Elastic Block Store) provides persistent block storage for EC2 instances.
- EFS (Elastic File System) provides a scalable and fault-tolerant network file system (NFSv4).
- FSx (Amazon FSx for Windows File Server) provides a fully managed Windows file server.
- S3 (Simple Storage Service) provides highly scalable and fault-tolerant object storage.
Looking for a comparison of all database services available on AWS instead? Check out Databases on AWS.
The following diagram was created with Cloudcraft. It shows that all of the discussed storage options are accessible from an EC2 instance. EFS and FSx are accessible from on-premises as well, and S3 is even accessible from the Internet.
Before we start, I recommend answering the following questions for the workload that you have in mind when selecting a storage option.
- What are the durability and availability requirements for your workload?
- How much data do you need to store?
- What’s the baseline and burst I/O throughput required by your workload?
- What’s the interface your workload expects to read/write data? A file system? Does your workload offer an S3 integration?
- Does your workload rely on low latency when accessing data from storage? What level of latency is tolerable?
Keep those questions in mind when reading through the brief introductions of the storage options on AWS.
EC2 Instance Store
EC2 Instance Store provides low latency and high throughput block storage for EC2 instances. The instance store offers access to HDDs or SSDs directly attached to the host machine of your virtual machine. Consider the instance store as ephemeral: data is lost whenever you or AWS stops or terminates the virtual machine.
AWS offers virtual machines of different instance types and groups those types into instance families. Some instance families come not only with a specific amount of CPU and memory resources but also with an EC2 Instance Store.
|Instance Type|vCPUs|Memory|Instance Storage|
|---|---|---|---|
|m5d.large|2|8.0 GiB|75 GiB NVMe SSD|
|c5d.large|2|5.0 GiB|50 GiB NVMe SSD|
|c5d.2xlarge|8|16.0 GiB|200 GiB NVMe SSD|
|i3en.24xlarge|96|768.0 GiB|60000 GiB (8 * 7500 GiB NVMe SSD)|
Use the EC2 Instance Store when low latency and high throughput access to your data is essential. To avoid data loss, only store temporary data, or make sure data gets replicated across a cluster of EC2 instances automatically. Typical use cases are swap volumes, distributed data stores (e.g., Elasticsearch, Kafka, …), and all kinds of temporary data (e.g., caches, batch processing, …).
EBS (Elastic Block Store)
Think of EBS as a SAN (Storage Area Network). Only an EC2 instance can access data on an EBS volume, and typically there is a one-to-one relationship between an EC2 instance and an EBS volume.
Typical use cases for EBS are: operating database-like systems on EC2 instances (e.g., MS SQL) or migrating Lift & Shift workloads, which require access to a file system and write data often.
EBS comes with the following advantages:
- The lifecycle of an EBS volume is independent of the EC2 instance. Stopping and terminating a virtual machine does not affect the data stored on the volume. Therefore, we consider an EBS volume as persistent storage.
- EBS replicates data among multiple disks within an availability zone by default. AWS aims for data durability of 99.8% over a year.
- Creating a snapshot of an EBS volume backs up the data across multiple availability zones.
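To put the 99.8% durability figure into perspective, here is a back-of-the-envelope sketch, assuming volumes fail independently with an annual failure rate of 0.2% (the flip side of 99.8% durability per year):

```python
# Back-of-the-envelope durability math for EBS volumes, assuming an annual
# failure rate (AFR) of 0.2% -- the flip side of 99.8% durability per year.
AFR = 1 - 0.998  # = 0.002

def expected_failures(num_volumes: int, afr: float = AFR) -> float:
    """Expected number of volume failures per year across a fleet."""
    return num_volumes * afr

def prob_any_failure(num_volumes: int, afr: float = AFR) -> float:
    """Probability that at least one volume fails within a year."""
    return 1 - (1 - afr) ** num_volumes

print(round(expected_failures(1000), 2))  # ~2 expected failures per year for 1,000 volumes
print(round(prob_any_failure(100), 3))    # ~0.181 chance of at least one failure among 100
```

In other words, with a few hundred volumes you should plan for a volume failure every year, which is why snapshots and cross-AZ replication matter.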
However, there are some limitations, as well:
- Compared to the EC2 Instance Store, the network connection between the EC2 instance and the EBS volume adds latency and limits the maximum throughput.
- You need to provision the storage capacity of an EBS volume upfront. The capacity does not grow automatically at runtime. Therefore, you typically need to overprovision storage capacity.
- An EBS volume replicates data within a single availability zone. The SLA for an EC2 instance running in a single availability zone is 90%. So, you need to find a way to replicate your data into another availability zone for business-critical workloads. Unfortunately, EBS snapshots are only of limited help: risk of inconsistent data, no RTO (Recovery Time Objective) guarantees, etc.
Those limitations are good reasons to try to store your data elsewhere, for example, with one of the three storage services presented next.
EFS (Elastic File System)
EFS is based on a protocol that is more than 35 years old: NFS (Network File System). NFS allows multiple machines to access the same file system via the network. However, EFS is by no means obsolete; it offers a network file system ready for the cloud:
- The storage capacity grows on demand; you only pay for what you use.
- Adjusting the maximum I/O throughput to your needs is possible as well: provisioned throughput, or bursting performance that depends on the storage size.
- Data is replicated among multiple availability zones by default.
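The bursting model can be made concrete with a small sketch. It assumes the documented baseline rate of 50 KiB/s per GiB of stored data in bursting throughput mode and ignores burst credits, which allow short periods above the baseline:

```python
# Sketch of EFS bursting throughput mode: the baseline rate scales with the
# amount of data stored (assumed here: 50 KiB/s per GiB of stored data;
# burst credits on top of the baseline are ignored in this simplification).
def efs_baseline_throughput_mib_s(stored_gib: float) -> float:
    """Baseline throughput in MiB/s for a given amount of stored data."""
    return stored_gib * 50 / 1024  # 50 KiB/s per GiB -> MiB/s

print(efs_baseline_throughput_mib_s(1024))  # 1 TiB stored -> 50.0 MiB/s
print(efs_baseline_throughput_mib_s(100))   # 100 GiB stored -> ~4.9 MiB/s
```

The takeaway: small file systems get a small baseline, so workloads with little data but high throughput needs are better served by provisioned throughput.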
EFS is a perfect choice for workloads that you want or need to distribute among multiple machines and availability zones but that require access to a shared file system. Typical examples are content management systems (WordPress, TYPO3), CI/CD (Jenkins, GitLab), legacy web applications, and many more.
Keep in mind that EFS is not designed for latency-critical workloads. For example, you should not even think about using EFS for a database-like system. Another significant limitation is that EFS is only accessible from Linux machines, as it requires NFSv4.
FSx (Amazon FSx for Windows File Server)
Generally speaking, FSx is similar to EFS. However, FSx uses the SMB protocol, which is supported by Windows, Linux, and macOS, instead of NFSv4. Therefore, FSx is the perfect choice whenever you need a network file system in a non-Linux scenario.
There are some notable differences between FSx and EFS, though:
- FSx does not make use of multiple availability zones by default. However, there is an option to operate a standby server in a second availability zone. FSx promises an availability of 99.9%. EFS replicates data among at least three availability zones and therefore comes with an availability objective of 99.99%.
- FSx supports data deduplication, which reduces storage costs. EFS does not provide that functionality.
What is true for EFS applies to FSx as well: the shared file system is not designed for latency-critical workloads.
S3 (Simple Storage Service)
S3 is different. A REST API accessible from the Internet – so basically from anywhere – provides access to the object store. The Simple Storage Service distributes data among multiple availability zones by default. Also, storage capacity and read/write throughput scale on demand.
Due to its accessibility and flexibility, it is hard to pin down typical use cases for S3. Instead, here are some examples of what I’ve used S3 for within the last few months:
- Backing up my MacBook Air.
- Hosting a static website.
- Exchanging data for analytics and machine learning between different organizations.
- Storing snapshots of Elasticsearch, a search engine and database.
- Storing user-generated content accessed by mobile devices.
- And more!
As mentioned above, S3 provides object storage. In general, object storage fits best for workloads with write-seldom, read-often file access patterns. That’s because you have to upload the whole file whenever you want to modify it. Therefore, you should not use S3 to store active log files that a process appends new lines to regularly, for example. However, S3 is a perfect fit for archiving your log files.
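The write-seldom, read-often recommendation follows directly from objects being immutable: there is no partial update, every write replaces the whole object. The following sketch mimics that behavior with a toy in-memory object store (the `MiniObjectStore` class is purely illustrative, not an AWS API) to show why appending to a log line by line is expensive on S3:

```python
class MiniObjectStore:
    """Toy in-memory object store mimicking S3 semantics:
    objects are written and replaced as a whole, never patched in place."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}
        self.bytes_uploaded = 0  # track total upload traffic

    def put_object(self, key: str, body: bytes) -> None:
        # Every write replaces the whole object; no append or partial write.
        self._objects[key] = body
        self.bytes_uploaded += len(body)

    def get_object(self, key: str) -> bytes:
        return self._objects[key]

store = MiniObjectStore()

# "Appending" three 7-byte lines means re-uploading the object each time.
log = b""
for i in range(3):
    log += f"line {i}\n".encode()
    store.put_object("logs/app.log", log)

print(store.bytes_uploaded)  # 7 + 14 + 21 = 42 bytes uploaded for a 21-byte log
```

The upload traffic grows quadratically with the number of appends, while archiving the finished 21-byte log once would cost exactly 21 bytes.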
To read and write data from S3, your workload needs to interact with a REST API. Nowadays, many software vendors and open-source projects ship with an S3 integration. A legacy application, however, will probably rely on a file system and therefore does not qualify for storing data on S3.
The following table compares the different storage options that you have learned about so far. Most likely, you need to find the best compromise for the specific requirements of your workload.
| |Instance Store|EBS|EFS|FSx|S3|
|---|---|---|---|---|---|
|Interface|Block storage|Block storage|NFSv4|SMB|REST API|
|Accessibility|EC2|EC2|EC2, on-premises|EC2, on-premises|Anywhere|
|Throughput|Up to ~5,500 MB/s|Up to 2,375 MB/s|Default limit is 1,000 MB/s but can be increased|Hundreds of GB/s|Very high|
|Durability|Ephemeral|99.8%|99.999999999%|Unspecified (probably >99.8%)|99.999999999%|
|Availability Zones|Single AZ|Single AZ|Multi-AZ (default)|Multi-AZ (optional)|Multi-AZ (default)|
|Backups/Snapshots|n/a|EBS Snapshots|AWS Backup|Volume Shadow Copy Service (VSS)|n/a|
Are you overwhelmed by the options and differences? The following decision tree will guide you in the right direction.
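One way to make such a decision tree executable is to encode it as a function. The following sketch is my own simplification of the questions discussed in this article (the question order and parameter names are assumptions, not the exact tree from the diagram):

```python
def recommend_storage(
    needs_filesystem: bool,
    shared_across_instances: bool,
    windows_or_smb: bool = False,
    data_must_survive_instance: bool = True,
    latency_critical: bool = False,
) -> str:
    """Simplified decision tree for picking an AWS storage service."""
    if not needs_filesystem:
        return "S3"  # object storage via REST API, accessible from anywhere
    if shared_across_instances:
        # A shared network file system: protocol decides between EFS and FSx.
        return "FSx" if windows_or_smb else "EFS"
    if latency_critical and not data_must_survive_instance:
        return "Instance Store"  # ephemeral, but lowest latency
    return "EBS"  # persistent block storage for a single instance

print(recommend_storage(needs_filesystem=False, shared_across_instances=False))  # S3
print(recommend_storage(True, True, windows_or_smb=True))                        # FSx
print(recommend_storage(True, False, data_must_survive_instance=False,
                        latency_critical=True))                                  # Instance Store
```

Treat the function as a starting point: real workloads usually need the compromise weighed against the durability, throughput, and latency questions from the beginning of the article.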
Which storage services are you currently using? I’m keen to learn about your workloads and the storage options you have been choosing. Please leave a comment below!