Day 24 - Highly Available and Scalable Django Application on AWS using Terraform

Today I worked on deploying a highly available and scalable Django application on AWS using Terraform. The goal of this project was to understand how production-style AWS infrastructure is designed across multiple Availability Zones while keeping the application secure, scalable, and resilient.

Instead of deploying a single EC2 instance in a public subnet, this setup used private EC2 instances behind an Application Load Balancer. The infrastructure also included Auto Scaling Groups, NAT Gateways, route tables, security groups, and multi-AZ networking.

Architecture Overview

The infrastructure was deployed inside a custom VPC across two Availability Zones.

Main components used:

  • VPC with public and private subnets
  • Internet Gateway
  • NAT Gateways for outbound internet access
  • Application Load Balancer
  • Private EC2 instances
  • Auto Scaling Group
  • Dockerized Django application
  • Terraform Infrastructure as Code

VPC and Networking Design

The VPC CIDR block used was:

10.0.0.0/16

Two public subnets were created for the ALB and NAT Gateways:

10.0.1.0/24
10.0.2.0/24

Two private subnets were created for EC2 instances:

10.0.11.0/24
10.0.12.0/24

This design allowed the ALB to receive public traffic while keeping application servers private.
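
In Terraform, this layout looks roughly like the sketch below (resource names are illustrative, not the exact ones from the project code):

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = { Name = "day24-vpc" }
}

# Public subnets host the ALB and the NAT Gateways, one per AZ
resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.main.id
  cidr_block              = ["10.0.1.0/24", "10.0.2.0/24"][count.index]
  availability_zone       = ["us-east-1a", "us-east-1b"][count.index]
  map_public_ip_on_launch = true
}

# Private subnets host the application instances
resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = ["10.0.11.0/24", "10.0.12.0/24"][count.index]
  availability_zone = ["us-east-1a", "us-east-1b"][count.index]
}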


Security Design

The Application Load Balancer security group allowed inbound HTTP traffic from the internet.

The EC2 security group only allowed traffic from the ALB security group on port 8000.

This meant the application instances could not be accessed directly from the internet.
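
A minimal sketch of the two security groups, assuming illustrative resource names:

# ALB security group: HTTP from anywhere
resource "aws_security_group" "alb" {
  name   = "day24-alb-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# EC2 security group: port 8000 allowed only from the ALB security group
resource "aws_security_group" "app" {
  name   = "day24-app-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port       = 8000
    to_port         = 8000
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}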

Application Load Balancer

The ALB distributed traffic across EC2 instances running in different Availability Zones.

Health checks were configured so unhealthy instances would automatically stop receiving traffic.
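
Roughly how the ALB, HTTP listener, and target group could be wired together in Terraform; the health check path and thresholds here are assumptions:

resource "aws_lb" "app" {
  name               = "day24-alb"
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = aws_subnet.public[*].id   # one public subnet per AZ
}

resource "aws_lb_target_group" "app" {
  name     = "day24-tg"
  port     = 8000
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id

  # Unhealthy targets are taken out of rotation automatically
  health_check {
    path                = "/"
    healthy_threshold   = 2
    unhealthy_threshold = 3
    interval            = 30
  }
}

# Listener on port 80 forwards traffic to the target group
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.app.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}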

Auto Scaling Group

The Auto Scaling Group maintained:

  • Minimum instances: 1
  • Desired instances: 2
  • Maximum instances: 5

CPU-based scaling policies allowed the environment to scale out automatically during periods of higher load.
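
A sketch of the Auto Scaling Group and a CPU target-tracking policy; the 50% CPU target is an assumption, while the capacities match the values above:

resource "aws_autoscaling_group" "app" {
  name                = "day24-asg"
  min_size            = 1
  desired_capacity    = 2
  max_size            = 5
  vpc_zone_identifier = aws_subnet.private[*].id
  target_group_arns   = [aws_lb_target_group.app.arn]
  health_check_type   = "ELB"

  launch_template {
    id      = aws_launch_template.app.id   # shown in the next section
    version = "$Latest"
  }
}

# Scale out and back in to keep average CPU around 50%
resource "aws_autoscaling_policy" "cpu" {
  name                   = "day24-cpu-target"
  autoscaling_group_name = aws_autoscaling_group.app.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 50
  }
}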


Dockerized Django Application

The EC2 instances used a user data script to install Docker and run the Django container automatically during startup.

Docker image used:

itsbaivab/django-app

This kept deployments consistent across instances.
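
A possible launch template for this setup; the AMI lookup, instance type, and exact user data commands are assumptions, but the idea is the same: install Docker and start the container on boot:

data "aws_ami" "al2023" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }
}

resource "aws_launch_template" "app" {
  name_prefix            = "day24-"
  image_id               = data.aws_ami.al2023.id
  instance_type          = "t3.micro"
  vpc_security_group_ids = [aws_security_group.app.id]

  # Install Docker and run the Django container on startup
  user_data = base64encode(<<-EOF
    #!/bin/bash
    dnf install -y docker
    systemctl enable --now docker
    docker run -d --restart unless-stopped -p 8000:8000 itsbaivab/django-app
  EOF
  )
}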

NAT Gateway Usage

Because the EC2 instances were deployed in private subnets, they needed outbound internet access to download packages and Docker images.

NAT Gateways solved this problem while still keeping the instances private.

One NAT Gateway was deployed per Availability Zone for high availability.
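
The per-AZ NAT Gateways, Elastic IPs, and private route tables look roughly like this in Terraform:

# One Elastic IP and NAT Gateway per AZ, placed in the public subnets
resource "aws_eip" "nat" {
  count  = 2
  domain = "vpc"
}

resource "aws_nat_gateway" "nat" {
  count         = 2
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
}

# Each private subnet routes outbound traffic through the NAT Gateway in its own AZ
resource "aws_route_table" "private" {
  count  = 2
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat[count.index].id
  }
}

resource "aws_route_table_association" "private" {
  count          = 2
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}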

Validation and Testing Steps

1. Validate Terraform Deployment

terraform output

Expected:

load_balancer_dns = "day24-alb-xxxxx.us-east-1.elb.amazonaws.com"
nat_gateway_1_ip = "x.x.x.x"
nat_gateway_2_ip = "x.x.x.x"
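
These values map to Terraform output blocks along these lines (attribute references assume the resource names used in the sketches above):

output "load_balancer_dns" {
  value = aws_lb.app.dns_name
}

output "nat_gateway_1_ip" {
  value = aws_eip.nat[0].public_ip
}

output "nat_gateway_2_ip" {
  value = aws_eip.nat[1].public_ip
}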

2. Test Application Access

Open the ALB using HTTP, not HTTPS:

http://day24-alb-1985674148.us-east-1.elb.amazonaws.com/

Expected:

Django application loads successfully

3. Validate Load Balancer

Go to:

AWS Console → EC2 → Load Balancers → day24-alb

Check:

State = Active
Scheme = internet-facing
Listener = HTTP : 80
Availability Zones = us-east-1a and us-east-1b

4. Validate Target Group Health

Go to:

AWS Console → EC2 → Target Groups → day24-tg → Targets

Expected:

2 targets registered
Health status = Healthy

5. Validate Private EC2 Instances

Go to:

AWS Console → EC2 → Instances

Check:

Instances are running
Instances are in private subnets
Public IPv4 address is blank

6. Validate Auto Scaling Group

Go to:

AWS Console → EC2 → Auto Scaling Groups → day24-asg

Check:

Min capacity = 1
Desired capacity = 2
Max capacity = 5
Health check type = ELB

7. Test High Availability

Before failure test:

Target Group → Targets

Confirm:

2 Healthy targets

Then terminate one EC2 instance manually:

EC2 → Instances → Select one day24 instance → Instance state → Terminate

Expected:

Application should still load through ALB
ALB routes traffic to remaining healthy instance
ASG launches a replacement instance

Refresh:

http://day24-alb-1985674148.us-east-1.elb.amazonaws.com/

Observed during the test:

  • One instance terminated
  • Application still accessible through the ALB
  • ASG launching a replacement instance
  • Target group back to 2 healthy targets

8. Validate Self Healing

After a few minutes, check:

EC2 → Auto Scaling Groups → Activity

Expected:

ASG detected terminated instance
ASG launched replacement instance
Desired capacity restored to 2

Cost Considerations

This project is closer to a production architecture, so NAT Gateway pricing becomes noticeable.

Estimated monthly cost:

  • EC2 instances: ~$17
  • ALB: ~$16
  • NAT Gateways: ~$65
  • Data transfer: ~$5 to $10

Total estimated cost:

~$103 to $108 per month

Resources should be destroyed after testing.


Cleanup

terraform destroy -auto-approve

What I Learned

This project helped me better understand how AWS networking, load balancing, scaling, and security work together in a production-style environment.

The biggest learning was seeing how private EC2 instances can still function correctly through NAT Gateways while remaining protected from direct internet access.

I also learned how Auto Scaling Groups and ALBs work together to improve both scalability and high availability.


Key Takeaways

  • Multi-AZ deployment improves availability
  • Private subnets improve security
  • NAT Gateways provide outbound internet access
  • ALB distributes traffic only to healthy targets
  • Auto Scaling Groups improve resilience
  • Terraform simplifies repeatable deployments
  • Docker standardizes application deployment
