Running an application in multiple availability zones (data centers) in a single region is a best practice when architecting on AWS. Interested in learning more about multi-AZ? Take a look at our previous post here.
Depending on your availability or latency requirements, you might need to deploy your application across multiple regions. For example, use US East (Ohio), US West (Oregon), and Europe (Ireland) to operate your application close to your users. In this blog post, I will show what Multi-Region architectures look like. We will also discuss the two main challenges: replicating state and ingress routing.
Architecture Overview
Running an application in multiple regions requires us to set up the infrastructure for our application in each region. For a typical web application, the infrastructure includes:
- load balancer
- compute layer (EC2, ECS, EKS, Lambda)
- datastore
You create the infrastructure as before in each region.
Pro Tip: If you invested in Infrastructure as Code, you can easily recreate your environment in another region with CloudFormation or Terraform.
The following figure shows the overall architecture.
Two challenges need further investigation:
- How can you route your users to the closest available region?
- How can you replicate the data between the data stores in the different regions?
Let’s have a look.
Ingress Routing
What is closest, and what is available? To calculate how close a user is to a region, either the latency or the geographical distance is used. The availability of a region is determined by health checks that constantly send requests to your application in each region.
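To make "closest available" concrete, here is a minimal sketch of the decision, assuming latency is the distance metric. The `RegionStatus` type, the latency numbers, and the health flags are made up for illustration; in practice the routing service collects these measurements for you.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegionStatus:
    name: str          # region identifier, e.g. "us-east-2"
    latency_ms: float  # measured latency from the user to this region
    healthy: bool      # result of the most recent health check


def closest_available(regions: list[RegionStatus]) -> Optional[str]:
    """Return the healthy region with the lowest latency, or None if all are down."""
    healthy = [r for r in regions if r.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda r: r.latency_ms).name


# Illustrative values only: eu-west-1 is closest but fails its health check.
regions = [
    RegionStatus("us-east-2", latency_ms=95.0, healthy=True),
    RegionStatus("us-west-2", latency_ms=140.0, healthy=True),
    RegionStatus("eu-west-1", latency_ms=20.0, healthy=False),
]
print(closest_available(regions))  # us-east-2
```

The user would normally land in eu-west-1, but because that region is unhealthy, the next closest healthy region wins.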
The following options are available on AWS:
- Route53
- Global Accelerator
The capabilities are similar, but the technology is different. Route53 is based on DNS. Because resolvers and clients cache DNS responses, your users might not pick up a changed record fast enough for a quick failover. Global Accelerator exposes two static Anycast IP addresses for you. Anycast means that multiple servers announce the same IP address, and routers deliver each packet to the destination with the shortest path. Additionally, Global Accelerator routes traffic onto the AWS backbone as early as possible to reduce latency. Keep in mind that Route53 is cheaper than Global Accelerator.
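For the DNS option, a latency-based setup boils down to one record per region under the same name. The sketch below only builds the change batch as a plain dictionary, following the structure of the Route53 ChangeResourceRecordSets API as I understand it. The record name, the load balancer DNS names, and the zone IDs are placeholders, and the helper `latency_change_batch` is hypothetical.

```python
def latency_change_batch(record_name: str, endpoints: dict[str, dict[str, str]]) -> dict:
    """Build one latency-based alias record per region for the same DNS name."""
    changes = []
    for region, alb in sorted(endpoints.items()):
        changes.append({
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                "SetIdentifier": region,  # must be unique per record with the same name
                "Region": region,         # turns the record into a latency-based record
                "AliasTarget": {
                    "HostedZoneId": alb["zone_id"],
                    "DNSName": alb["dns_name"],
                    "EvaluateTargetHealth": True,  # skip regions whose target is unhealthy
                },
            },
        })
    return {"Comment": "latency-based routing (sketch)", "Changes": changes}


# Placeholder endpoints: replace with the values of your own load balancers.
endpoints = {
    "us-east-2": {"dns_name": "alb-ohio.example.com.", "zone_id": "ZEXAMPLEOHIO"},
    "eu-west-1": {"dns_name": "alb-ireland.example.com.", "zone_id": "ZEXAMPLEIRELAND"},
}
batch = latency_change_batch("app.example.com.", endpoints)
print(len(batch["Changes"]))  # 2
```

With boto3, a dictionary like this would be passed as the `ChangeBatch` argument of `change_resource_record_sets`, together with the ID of your hosted zone. Treat it as a starting point and check the field names against the current API reference.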
Replicating State
So far, a user is always routed to the closest available region. But what happens if a user boards a plane to travel between continents? Even worse, what happens if one region goes down? How can we ensure that user data is available in the right region? The straightforward solution is to replicate all your data to all your regions. Sounds easy? One thing to understand is that cross-region replication is almost always asynchronous, because synchronous replication would be too slow given the latency caused by the long distances. That works great until the same data item is updated in multiple regions at nearly the same time. That’s why some solutions allow writes in a single region only, while reads can happen across the globe.
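To see why concurrent updates are a problem, consider a minimal sketch of "last writer wins", one common way to resolve such conflicts. This is an illustration of the idea, not how any particular AWS service implements it; the `Write` type, the key, and the timestamps are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    key: str
    value: str
    timestamp: float  # wall-clock time of the write, in seconds
    region: str


def last_writer_wins(a: Write, b: Write) -> Write:
    """Resolve a conflict by keeping the write with the later timestamp."""
    # Ties are broken by region name so every replica picks the same winner.
    return max(a, b, key=lambda w: (w.timestamp, w.region))


# The same item is updated in two regions before replication catches up.
ohio = Write("user-42/email", "old@example.com", timestamp=100.0, region="us-east-2")
ireland = Write("user-42/email", "new@example.com", timestamp=100.4, region="eu-west-1")

winner = last_writer_wins(ohio, ireland)
print(winner.value)  # new@example.com
```

Both regions end up with the same value regardless of the order in which the updates arrive, but the write from us-east-2 is discarded without any error. If your application cannot tolerate that, writing in a single region is the safer choice.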
The following AWS datastores come with support for cross-region replication:
- DynamoDB
- RDS Aurora
- RDS
- S3
- ElastiCache
- DocumentDB
The comparison table at the end of this section compares the different options. Before we get to the comparison, let’s take a closer look at S3 cross-region replication.
S3 Bucket Cross-Region Replication configuration
Replicating S3 buckets is a little harder than it should be. You have to create a replication configuration for each pair of buckets, in both directions.
If you have two buckets in us-east-1 and us-west-2, you need to replicate:
- us-east-1 to us-west-2
- us-west-2 to us-east-1
This way, you can write to either us-east-1 or us-west-2 and see the object in the other bucket. If you want to have buckets in 3 regions, things get more complex. You need to replicate:
- us-east-1 to us-west-2
- us-east-1 to eu-west-1
- us-west-2 to us-east-1
- us-west-2 to eu-west-1
- eu-west-1 to us-east-1
- eu-west-1 to us-west-2
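The lists above are simply every ordered pair of distinct regions. A short sketch that enumerates them for any number of regions (the helper name `replication_pairs` and the fourth region are chosen for illustration):

```python
from itertools import permutations


def replication_pairs(regions: list[str]) -> list[tuple[str, str]]:
    """Return every (source, destination) pair needed for two-way replication."""
    return list(permutations(regions, 2))


three = replication_pairs(["us-east-1", "us-west-2", "eu-west-1"])
for source, destination in three:
    print(f"{source} to {destination}")
print(len(three))  # 6

# Adding a fourth region (chosen arbitrarily here) raises the count to 4 * 3.
four = replication_pairs(["us-east-1", "us-west-2", "eu-west-1", "ap-southeast-1"])
print(len(four))  # 12
```

A list like this is handy as input for Infrastructure as Code, so that you generate one replication configuration per pair instead of maintaining them by hand.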
If you need four regions, you will end up with 12 replication configurations. In general, n regions require n × (n - 1) of them. The following figure shows the configuration for three regions.
As promised, here is a comparison table with the available datastore options that support cross-region replication.
Feature | DynamoDB global tables | RDS Aurora global databases | S3 Cross-Region Two-Way Replication | ElastiCache for Redis Global Datastore | RDS Read Replica | DocumentDB Global Clusters |
Multi-Region write support | yes | no | yes | no | no | no |
Writer-Region Failover | n/a (not needed) | automated | n/a (not needed) | manual | manual | manual |
Propagation delay | ~ <1 second | ~ <1 second | ~ <15 minutes | ~ <1 second | ~ <1 second | ~ <1 second |
Conflict resolution | last writer wins | n/a (writes can only happen in one region) | all versions are replicated, order might differ | n/a (writes can only happen in one region) | n/a (writes can only happen in one region) | n/a (writes can only happen in one region) |
Multi-Region transactions support | no | n/a (writes can only happen in one region) | no | n/a (writes can only happen in one region) | n/a (writes can only happen in one region) | n/a (writes can only happen in one region) |
Multi-Region read support | yes | yes | yes | yes | yes | yes |
Maximum regions | all | 1+5 | all | 1+5 | 1+5 | 1+5 |
Serverless provisioning | yes (configurable) | no | yes | no | no | no |
Summary
To this day, Multi-Region deployments are the exception, not the norm. Luckily, routing user traffic into different regions is a solved problem on AWS. You can rely either on DNS with Route53 or Anycast with Global Accelerator.
Remember, not all datastores offer the same capabilities. When selecting a datastore, it is crucial to understand the differences as highlighted in the comparison table.
With all that in mind, you can architect Multi-Region systems on AWS today. Multi-Region AWS architectures are more complex and more expensive than a single-region deployment. Make sure that there is a real business case for the availability requirements.