Deploying a Secure Django App on AWS ECS Using Terraform and GitHub Actions
The Problem We’re Solving
In modern DevOps, deploying applications securely is non-negotiable—especially when dealing with production workloads.
But here’s the problem: Most ECS tutorials focus on “how to deploy fast” instead of “how to deploy securely.”
This project intentionally takes the security-first path, accepting some additional cost and complexity to achieve:
Private networking for containers
Secure image pulls without exposing the VPC to the public internet
Encrypted, managed databases
Clear separation between DevOps automation and runtime operations
That’s the problem this project solves.
Architecture Explained
High-Level Overview
Django App (Dockerized): the application
Terraform: infrastructure as code (IaC)
ECS Fargate: serverless container orchestration
RDS (PostgreSQL): managed database
ALB (Application Load Balancer): frontend routing
VPC Endpoints (Interface): private networking for ECR, S3, CloudWatch
CloudWatch Logs: centralized logging
GitHub Actions: CI/CD pipeline
Docker + ECR: container image build & storage
Architecture Diagram

File Structure Overview
.
├── app                      # Django Application
│   ├── dockerfile
│   ├── entrypoint.sh
│   └── hello_world_django_app
├── infrastructure           # Terraform IaC
│   ├── main.tf
│   └── modules
│       ├── computes
│       └── subnets
├── .github/workflows        # Github Action Workflow
└── README.md
Workflow Breakdown
Build: Docker image creation
Push: push image to ECR
Infrastructure: Terraform apply (ECS, RDS, ALB)
Deploy: ECS pulls image and serves app
Destroy: optional cleanup step
Exit: workflow cancellation
Explanation of the Chosen Services
ECS Fargate
Security: No SSH, runs in private subnet, AWS-managed runtime. Cost: Pay per vCPU and memory; slightly more expensive for long-running tasks than EC2. HA: Automatically spans multiple AZs. Complexity: Easier to operate but less control over OS-level configs.
Application Load Balancer (ALB)
Security: Terminates HTTPS, handles SSL/TLS certificates securely. Cost: ~$18–$20/month base + traffic costs. HA: Regional service, auto-scales and load balances across AZs.
Complexity: Adds config overhead if using path-based routing or multiple target groups.
VPC with Public/Private Subnets
Security: Public ALB, private ECS tasks & RDS. Minimizes surface area. Cost: No direct cost, but subnet design affects resource placement and networking choices. HA: Subnets in multiple AZs for failover. Complexity: More complex Terraform code; requires careful design to avoid misconfiguration.
VPC Endpoints (S3, ECR, Logs, Secrets Manager)
Security: Keeps traffic private; no internet exposure for pulls/logs. Cost: ~$7.3/month per interface endpoint (e.g., 4 endpoints = ~$29.2). HA: Requires per-AZ deployment for true HA. Complexity: Each service needs a separate endpoint; setup can get messy fast.
Amazon RDS (PostgreSQL Multi-AZ)
Security: Encrypted at rest & in transit; runs in private subnet. Cost: ~$30–$40/month for dev size; production costs much higher. HA: Multi-AZ failover, automated backups. Complexity: No OS-level access; bound to AWS maintenance windows.
CloudWatch Logs (via VPC Endpoint)
Security: Logs sent privately via VPC endpoint. Cost: ~$7.3/month for endpoint + $0.50/GB logs. HA: AWS-managed; no single point of failure. Complexity: Needs careful retention management or costs can spiral.
Code Explanation
Containerize Django App
App File structure
├── app                      # Django Application
│   ├── .dockerignore
│   ├── requirements.txt
│   ├── dockerfile
│   ├── entrypoint.sh
│   └── hello_world_django_app
In ./dockerfile
section:
# Stage 1: Build dependencies and install Python packages
FROM python:3.11-slim as builder
WORKDIR /app
# Install build dependencies only in builder stage
RUN apt-get update && apt-get install -y gcc libpq-dev netcat-openbsd && \
apt-get clean && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
# Install dependencies into a specific directory to copy later
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
COPY . .
# Stage 2: Runtime minimal image
FROM python:3.11-slim
WORKDIR /app
# Copy only the installed packages from builder
COPY --from=builder /install /usr/local
# Copy the application source code
COPY --from=builder /app .
EXPOSE 80
RUN chmod +x ./entrypoint.sh
ENTRYPOINT ["./entrypoint.sh"]
The Dockerfile starts by copying only the files the application needs and installing the right dependencies. Using the slim distro and a multi-stage build keeps the number of RUN and COPY instructions low, so the final image has fewer layers and stays small.
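As a quick local sanity check (a minimal sketch, assuming the image is built from the app directory and tagged hello-django; the tag name is just for illustration), you can build the image and inspect the layers and size the multi-stage build produces:
# Build the image locally from the app directory
docker build -t hello-django ./app
# Inspect the layers of the final (runtime) stage
docker history hello-django
# Check the resulting image size
docker images hello-django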
For the ENTRYPOINT, a small startup script makes sure the app is launched correctly.
In ./entrypoint.sh
section:
#!/bin/sh
echo "Applying database migrations..."
python manage.py migrate
echo "Creating superuser if not exists..."
echo "from django.contrib.auth import get_user_model; \
User = get_user_model(); \
User.objects.filter(username='admin').exists() or \
User.objects.create_superuser('admin', 'admin@example.com', 'adminpass')" | python manage.py shell
echo "Starting server..."
exec gunicorn hello_world_django_app.wsgi:application --bind 0.0.0.0:80
These are the startup commands the app needs, agreed upon with the dev team: apply migrations, bootstrap a superuser, then start Gunicorn.
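To verify the entrypoint locally before deploying (a sketch, assuming a throwaway Postgres container and the hypothetical hello-django tag from the build step above; on Linux, replace host.docker.internal with the host IP or put both containers on one network):
# Start a disposable Postgres 14 instance for the migrations to run against
docker run -d --name local-pg -e POSTGRES_DB=hello_db -e POSTGRES_USER=hello_user -e POSTGRES_PASSWORD=hello_pass -p 5432:5432 postgres:14
# Run the app container; the entrypoint applies migrations, creates the superuser, and starts Gunicorn on port 80
docker run --rm -p 8000:80 -e POSTGRES_DB=hello_db -e POSTGRES_USER=hello_user -e POSTGRES_PASSWORD=hello_pass -e POSTGRES_HOST=host.docker.internal -e POSTGRES_PORT=5432 hello-django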
Infrastructure Using Terraform
Infrastructure File structure
├── infrastructure
│   ├── main.tf
│   ├── modules
│   │   ├── computes
│   │   │   ├── main.tf
│   │   │   ├── output.tf
│   │   │   └── variable.tf
│   │   └── subnets
│   │       ├── main.tf
│   │       ├── output.tf
│   │       └── variables.tf
│   ├── output.tf
│   ├── provider.tf
│   ├── terraform-dev.tfvars
│   ├── terraform-prod.tfvars
│   └── variables.tf
Modules
Subnets
Everything I use for the VPC network lives in this directory.
In subnets/variables.tf
section:
variable "vpc_id" {}
variable "vpc_cidr" {}
variable "subnet_az" {
type = list(string)
}
variable "env" {}
variable "region" {}
variable "vpc_endpoint_sg" {
type = string
}
These are all the variables the subnets module needs to work.
In subnets/main.tf Section:
# Public Subnets Configuration
resource "aws_subnet" "public_subnet" {
count = 2
vpc_id = var.vpc_id
cidr_block = cidrsubnet(var.vpc_cidr, 4, count.index)
availability_zone = var.subnet_az[count.index]
tags = {
Name = "${var.env}-public-subnet-${count.index}"
}
}
# Private Subnet Configuration
resource "aws_subnet" "private_subnet" {
count = 2
vpc_id = var.vpc_id
cidr_block = cidrsubnet(var.vpc_cidr, 4, count.index + 2)
availability_zone = var.subnet_az[count.index]
tags = {
Name = "${var.env}-private-subnet-${count.index}"
}
}
Configured 4 subnets:
2 public subnets
2 private subnets
Used count to repeat the subnet creation twice per tier, and used cidrsubnet() to carve non-overlapping CIDR ranges out of the VPC block, as shown below.
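To see what cidrsubnet() actually produces here (a quick check in terraform console, assuming the 10.0.0.0/16 VPC CIDR from the tfvars):
terraform console
> cidrsubnet("10.0.0.0/16", 4, 0)   # public subnet 0
"10.0.0.0/20"
> cidrsubnet("10.0.0.0/16", 4, 2)   # private subnet 0 (count.index + 2)
"10.0.32.0/20"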
# Endpoints Configuration
resource "aws_vpc_endpoint" "ecr_dkr" {
vpc_id = var.vpc_id
service_name = "com.amazonaws.${var.region}.ecr.dkr"
vpc_endpoint_type = "Interface"
subnet_ids = aws_subnet.private_subnet[*].id
private_dns_enabled = true
security_group_ids = [var.vpc_endpoint_sg]
tags = {
Name = "${var.env}-ecr-endpoint-data-plane"
}
}
resource "aws_vpc_endpoint" "ecr_api" {
vpc_id = var.vpc_id
service_name = "com.amazonaws.${var.region}.ecr.api"
vpc_endpoint_type = "Interface"
subnet_ids = aws_subnet.private_subnet[*].id
private_dns_enabled = true
security_group_ids = [var.vpc_endpoint_sg]
tags = {
Name = "${var.env}-ecr-endpoint-control-plane"
}
}
resource "aws_vpc_endpoint" "s3_gateway" {
vpc_id = var.vpc_id
service_name = "com.amazonaws.${var.region}.s3"
vpc_endpoint_type = "Gateway"
route_table_ids = [aws_route_table.private_rtb.id]
tags = {
"Name" = "${var.env}-s3-gateway"
}
}
resource "aws_vpc_endpoint" "cloudwatch_logs" {
vpc_id = var.vpc_id
service_name = "com.amazonaws.${var.region}.logs"
vpc_endpoint_type = "Interface"
subnet_ids = aws_subnet.private_subnet[*].id
private_dns_enabled = true
security_group_ids = [var.vpc_endpoint_sg]
}
Configured 4 endpoints:
3 interface endpoints (ECR dkr, ECR api, CloudWatch Logs)
dkr => the registry endpoint ECS uses to push/pull image manifests from ECR
api => the control-plane endpoint ECS uses to request authorization tokens from ECR
logs => for troubleshooting any errors in the ECS container after the image is pulled
1 gateway endpoint (S3)
s3 => lets ECS download the image layers referenced by the manifest it got from ECR
vpc_id, region, and vpc_endpoint_sg are variables because their values are filled in by the root main.tf.
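After an apply, you can confirm the four endpoints exist and are available (a sketch using the AWS CLI; fill in your own VPC ID):
aws ec2 describe-vpc-endpoints \
  --filters Name=vpc-id,Values=<your-vpc-id> \
  --query 'VpcEndpoints[].{Service:ServiceName,Type:VpcEndpointType,State:State}' \
  --output table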
# Public Route Table Configuration
resource "aws_route_table" "public_rtb" {
vpc_id = var.vpc_id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.igw.id
}
tags = {
Name = "${var.env}-public-rtb"
}
}
resource "aws_route_table_association" "public_subnet_assoc" {
count = 2
subnet_id = aws_subnet.public_subnet[count.index].id
route_table_id = aws_route_table.public_rtb.id
}
# Private Subnet Route table Configuration
resource "aws_route_table" "private_rtb" {
vpc_id = var.vpc_id
tags = {
Name = "${var.env}-private-rtb"
}
}
resource "aws_route_table_association" "private_subnet_assoc" {
count = 2
subnet_id = aws_subnet.private_subnet[count.index].id
route_table_id = aws_route_table.private_rtb.id
}
Configured two route tables:
One public route table
One private route table
The public route table is associated with the public subnets so that, once the Application Load Balancer is configured, users can reach it through the internet gateway.
The private route table is associated with the private subnets to keep the ECS tasks and the RDS database off the public internet.
# Accessing our network Using IGW
resource "aws_internet_gateway" "igw" {
vpc_id = var.vpc_id
tags = {
Name = "${var.env}-igw"
}
}
Configured one internet gateway so users can reach the application through it.
In subnets/output.tf
section:
output "public_subnet_1" {
value = aws_subnet.public_subnet[0]
}
output "public_subnet_2" {
value = aws_subnet.public_subnet[1]
}
output "private_subnet_1" {
value = aws_subnet.private_subnet[0]
}
output "private_subnet_2" {
value = aws_subnet.private_subnet[1]
}
These outputs are passed to the other services that need the subnet IDs.
Computes
Everything I use for the compute services lives in this directory.
In computes/variable.tf
variable "repo_name" {}
variable "cluster_name" {}
variable "network_mode" {}
variable "cluster_region" {}
variable "ecs_type" {}
variable "memory_size" {}
variable "cpu_size" {}
variable "container_port" {}
variable "host_port" {}
variable "env" {}
variable "desired_containers" {}
variable "public_ip" {}
variable "vpc_id" {}
variable "db_name" {}
variable "db_username" {}
variable "db_password" {}
variable "db_endpoint" {}
variable "db_port" {}
variable "service_subnets" {
type = list(string)
}
variable "service_security_groups" {
type = list(string)
}
variable "alb_target_type" {}
variable "alb_subnets" {
type = list(string)
}
variable "alb_security_groups" {
type = list(string)
}
These are all the variables the computes module (ECS and its supporting services) needs to work.
In computes/main.tf
section:
# ECR
resource "aws_ecr_repository" "my-app" {
name = var.repo_name
}
Configured the ECR repository with a dynamic name so it can be passed to the other services that need it.
# ECS Configurations
resource "aws_ecs_cluster" "ecs_cluster" {
name = var.cluster_name
}
resource "aws_ecs_service" "my_app_service" {
name = "${var.cluster_name}-service"
cluster = aws_ecs_cluster.ecs_cluster.id
task_definition = aws_ecs_task_definition.my_app_task.arn
launch_type = "${var.ecs_type}"
desired_count = var.desired_containers
depends_on = [
aws_iam_role_policy_attachment.ecs_execution_role_policy,
aws_iam_role_policy_attachment.ecs_execution_ecr_vpc_attach,
aws_iam_policy_attachment.ecs_task_s3_attach,
aws_lb_listener.ecs_alb_listener
]
load_balancer {
target_group_arn = aws_lb_target_group.ecs_tg.arn
container_name = "${var.repo_name}"
container_port = var.container_port
}
network_configuration {
subnets = var.service_subnets
security_groups = var.service_security_groups
assign_public_ip = var.public_ip
}
}
Created the ECS cluster and the ECS service.
# ECS Task Definition Configuration
data "aws_caller_identity" "current" {} # to get your Account ID
resource "aws_ecs_task_definition" "my_app_task" {
family = "${var.cluster_name}_task"
requires_compatibilities = ["${var.ecs_type}"]
network_mode = var.network_mode
cpu = var.cpu_size
memory = var.memory_size
execution_role_arn = aws_iam_role.ecs_task_execution_role.arn
task_role_arn = aws_iam_role.ecs_task_role.arn
container_definitions = jsonencode([
{
name = "${var.repo_name}"
image = "${data.aws_caller_identity.current.account_id}.dkr.ecr.${var.cluster_region}.amazonaws.com/${aws_ecr_repository.my-app.name}:latest"
essential = true
environment = [
{
name = "POSTGRES_DB"
value = var.db_name
},
{
name = "POSTGRES_USER"
value = var.db_username
},
{
name = "POSTGRES_HOST"
value = var.db_endpoint
},
{
name = "POSTGRES_PORT"
value = tostring(var.db_port)
},
{
name = "POSTGRES_PASSWORD"
value = var.db_password
}
]
portMappings = [
{
containerPort = var.container_port
hostPort = var.host_port
}
]
logConfiguration = {
logDriver = "awslogs"
options = {
awslogs-group = "/ecs/${var.repo_name}"
awslogs-region = "${var.cluster_region}"
awslogs-stream-prefix = "ecs"
}
}
}
])
depends_on = [ aws_ecr_repository.my-app ]
}
resource "aws_cloudwatch_log_group" "ecs_logs" {
name = "/ecs/${var.repo_name}"
retention_in_days = 7
}
Used data aws_caller_identity {} to get the account ID for building the image URL; it is also useful when you have a cross-account ECR.
Used aws_ecs_task_definition to set up the container settings: which cluster it runs in, the resources it needs, the IAM roles, the image URL, and finally the app's environment variables or secrets.
The ECS task definition depends on the ECR repository to get the image URL, which is why aws_ecs_task_definition cannot be created before the ECR repository is initialized.
Used aws_cloudwatch_log_group to collect the container logs.
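Once a task is running, that log group makes troubleshooting straightforward (a sketch, assuming repo_name resolves to my-python-app as in the workflow env later on):
# Follow the ECS container logs shipped through the CloudWatch Logs VPC endpoint
aws logs tail /ecs/my-python-app --follow --since 15m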
# IAM Role Configuration
resource "aws_iam_role" "ecs_task_execution_role" {
name = "${var.env}-ecs-task-execution-role"
assume_role_policy = jsonencode({
Version = "2012-10-17",
Statement = [{
Effect = "Allow",
Principal = {
Service = "ecs-tasks.amazonaws.com"
},
Action = "sts:AssumeRole"
}]
})
tags = {
Name = "${var.env}-ecs-task-execution-role"
}
}
resource "aws_iam_role_policy_attachment" "ecs_execution_role_policy" {
role = aws_iam_role.ecs_task_execution_role.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
resource "aws_iam_policy" "ecs_execution_ecr_vpc_policy" {
name = "${var.env}-ecs-execution-ecr-vpc-policy"
policy = jsonencode({
Version = "2012-10-17",
Statement = [
{
Effect = "Allow",
Action = [
"ecr:GetDownloadUrlForLayer",
"ecr:BatchGetImage",
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"secretsmanager:GetSecretValue",
"kms:Decrypt",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
Resource = "*"
}
]
})
}
resource "aws_iam_role_policy_attachment" "ecs_execution_ecr_vpc_attach" {
role = aws_iam_role.ecs_task_execution_role.name
policy_arn = aws_iam_policy.ecs_execution_ecr_vpc_policy.arn
}
resource "aws_iam_role" "ecs_task_role" {
name = "${var.env}-ecs-task-role"
assume_role_policy = jsonencode({
Version = "2012-10-17",
Statement = [{
Effect = "Allow",
Principal = {
Service = "ecs-tasks.amazonaws.com"
},
Action = "sts:AssumeRole"
}]
})
tags = {
Name = "${var.env}-ecs-task-role"
}
}
resource "aws_iam_policy" "ecs_task_s3_policy" {
name = "${var.env}-ecs-task-s3-policy"
description = "Policy for ECS tasks to access S3 bucket"
policy = jsonencode({
Version = "2012-10-17",
Statement = [
{
Effect = "Allow",
Action = [
"s3:GetObject",
"s3:ListBucket"
],
Resource = ["*"]
}
]
})
}
resource "aws_iam_policy_attachment" "ecs_task_s3_attach" {
name = "${var.env}-ecs-task-s3-attach"
roles = [aws_iam_role.ecs_task_role.name]
policy_arn = aws_iam_policy.ecs_task_s3_policy.arn
}
Used "aws_iam_role" "ecs_task_execution_role"
to tell which service could use this policies that i will attach to it later.
In ecs_execution_role_policy
starts to attach the Policies that the ECS will need for Pull,Push, Authorization, and etc...
In ecs_task_s3_policy
from the name its obvious what it will do, as it will Get/List objects from S3, in the ECS it will Get the Image Layer.
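A quick way to double-check the attachments after apply (a sketch, assuming env is prod, so the role names resolve to prod-ecs-task-execution-role and prod-ecs-task-role):
# List the managed policies attached to the execution role
aws iam list-attached-role-policies --role-name prod-ecs-task-execution-role
# And the policies attached to the task role used by the running containers
aws iam list-attached-role-policies --role-name prod-ecs-task-role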
# ALB
resource "aws_lb" "ecs_alb" {
name = "ecs-alb"
internal = false
load_balancer_type = "application"
security_groups = var.alb_security_groups
subnets = var.alb_subnets
enable_deletion_protection = false
tags = {
Name = "ecs-alb"
}
}
resource "aws_lb_target_group" "ecs_tg" {
name = "ecs-target-group"
port = 80 # default port to the audience
protocol = "HTTP"
target_type = "${var.alb_target_type}"
vpc_id = var.vpc_id
health_check {
path = "/"
protocol = "HTTP"
port = 80
}
}
resource "aws_lb_listener" "ecs_alb_listener" {
load_balancer_arn = aws_lb.ecs_alb.arn
port = 80
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.ecs_tg.arn
}
}
Used aws_lb to create the ALB, and aws_lb_target_group to register the ECS tasks as targets on the container's port; the listener then forwards incoming HTTP traffic to that target group.
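After the service is up, you can verify the ALB is routing to healthy targets and reach the app (a sketch; the target group ARN can be taken from the console or the Terraform state):
# Check target health for the ECS tasks registered in the target group
aws elbv2 describe-target-health --target-group-arn <ecs-target-group-arn>
# Grab the ALB DNS name and hit the app over HTTP
ALB_DNS=$(aws elbv2 describe-load-balancers --names ecs-alb --query 'LoadBalancers[0].DNSName' --output text)
curl -I "http://$ALB_DNS/"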
In computes/output.tf
section:
output "ecr_repo_url" {
value = aws_ecr_repository.my-app.repository_url
}
Finally, the ecr_repo_url output exposes the ECR repository URL so other steps (the CI/CD workflow in particular) can use it.
Root Configurations
Let's step out of the modules and integrate every configuration we wrote.
In ./variables.tf
section:
variable "environment" {
description = "The environment for the resources"
type = string
default = "dev"
}
variable "cidr_block" {
description = "The CIDR block for the Network"
type = list(object({
name = string
cidr = string
}))
}
variable "az" {
description = "The Availability Zone for the Subnet"
type = list(string)
}
variable "region" {
description = "The Region that i will implement my Infra in AWS"
default = "us-east-1"
}
variable "container_port" {}
variable "cpu" {}
variable "memory" {}
variable "db_master_password" {
description = "Master password for RDS"
type = string
sensitive = true
}
variable "image_name" {
description = "Contains the image name"
type = string
}
These are all the variables the root configuration (and, through it, all services) needs to work.
In ./main.tf
section:
resource "aws_vpc" "main" {
cidr_block = var.cidr_block[0].cidr
enable_dns_support = true
enable_dns_hostnames = true
tags = {
Name = "${var.environment}-${var.cidr_block[0].name}"
}
}
resource "aws_default_route_table" "default_rtb" {
default_route_table_id = aws_vpc.main.default_route_table_id
tags = {
Name = "${var.environment}-default-rtb"
}
}
module "subnet" {
source = "./modules/subnets"
vpc_id = aws_vpc.main.id
vpc_cidr = aws_vpc.main.cidr_block
region = var.region
subnet_az = var.az
env = var.environment
vpc_endpoint_sg = aws_security_group.vpc_endpoints_sg.id
}
Configured the VPC so all subnets share the same network, and used aws_default_route_table so every subnet attached to it can communicate easily.
Used module "subnet" to pull in the resources defined in the Subnets module section above.
# Computing - AWS ECS
module "computes" {
source = "./modules/computes"
env = var.environment
vpc_id = aws_vpc.main.id
repo_name = var.image_name
cluster_name = "my-app-cluster"
cluster_region = var.region
ecs_type = "FARGATE"
network_mode = "awsvpc"
memory_size = var.memory
cpu_size = var.cpu
desired_containers = 3
container_port = var.container_port
host_port = var.container_port
service_subnets = [module.subnet.private_subnet_1.id, module.subnet.private_subnet_2.id]
service_security_groups = [aws_security_group.ecs_sg.id]
public_ip = false
alb_subnets = [module.subnet.public_subnet_1.id, module.subnet.public_subnet_2.id]
alb_security_groups = [aws_security_group.ecs_sg.id, aws_security_group.alb_sg.id]
alb_target_type = "ip"
db_name = aws_db_instance.rds_postgresql.db_name
db_username = aws_db_instance.rds_postgresql.username
db_password = aws_db_instance.rds_postgresql.password
db_endpoint = aws_db_instance.rds_postgresql.address
db_port = aws_db_instance.rds_postgresql.port
depends_on = [aws_db_instance.rds_postgresql]
}
Used module "computes"
to call our resources from Deploying a Secure Django App on AWS ECS Using Terraform and GitHub Actions Section.
# Database - AWS RDS
resource "aws_db_instance" "rds_postgresql" {
db_name = "hello_db"
identifier = "postgres-db"
username = "hello_user"
password = var.db_master_password
allocated_storage = 20
storage_encrypted = true
engine = "postgres"
engine_version = "14"
instance_class = "db.t3.micro"
apply_immediately = true
publicly_accessible = false # default is false
multi_az = true # keep a standby instance in a second AZ for failover
skip_final_snapshot = true # after deleting RDS aws will not create snapshot
copy_tags_to_snapshot = true # default = false
db_subnet_group_name = aws_db_subnet_group.db_attach_subnet.id
vpc_security_group_ids = [aws_security_group.ecs_sg.id, aws_security_group.vpc_1_security_group.id]
auto_minor_version_upgrade = false # default = false
allow_major_version_upgrade = true # default = true
backup_retention_period = 0 # default value is 7
delete_automated_backups = true # default = true
tags = {
Name = "${var.environment}-rds-posgress"
}
}
resource "aws_db_subnet_group" "db_attach_subnet" {
name = "db-subnet-group"
subnet_ids = [
"${module.subnet.private_subnet_1.id}",
"${module.subnet.private_subnet_2.id}"
]
tags = {
Name = "${var.environment}-db-subnets"
}
}
Configured PostgreSQL with what it needs to be highly available (Multi-AZ), secure (encrypted storage, private subnets, not publicly accessible), and recoverable; note that backup_retention_period is 0 and skip_final_snapshot is true here to keep costs down, so enable them for real disaster recovery.
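To confirm the instance came up the way the claims above describe (a sketch using the identifier postgres-db from the resource):
aws rds describe-db-instances \
  --db-instance-identifier postgres-db \
  --query 'DBInstances[0].{Status:DBInstanceStatus,MultiAZ:MultiAZ,Encrypted:StorageEncrypted,PubliclyAccessible:PubliclyAccessible}'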
# Security - AWS SG
resource "aws_security_group" "vpc_endpoints_sg" {
name_prefix = "${var.environment}-vpc-endpoints"
description = "Associated to ECR/s3 VPC Endpoints"
vpc_id = aws_vpc.main.id
ingress {
description = "Allow Nodes to pull images from ECR via VPC endpoints"
protocol = "tcp"
from_port = 443
to_port = 443
security_groups = [aws_security_group.ecs_sg.id]
}
ingress {
protocol = "tcp"
from_port = var.container_port
to_port = var.container_port
cidr_blocks = ["0.0.0.0/0"]
}
egress {
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_security_group" "ecs_sg" {
name_prefix = "${var.environment}-ecs-sg"
description = "Associated to ECS"
vpc_id = aws_vpc.main.id
ingress {
protocol = "tcp"
from_port = var.container_port
to_port = var.container_port
security_groups = [aws_security_group.alb_sg.id]
}
ingress {
protocol = "tcp"
from_port = 443
to_port = 443
security_groups = [aws_security_group.alb_sg.id]
}
egress {
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_security_group" "alb_sg" {
name_prefix = "${var.environment}-alb-sg"
description = "Associated to alb"
vpc_id = aws_vpc.main.id
ingress {
protocol = "tcp"
from_port = 443
to_port = 443
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
protocol = "tcp"
from_port = var.container_port
to_port = var.container_port
cidr_blocks = ["0.0.0.0/0"]
}
egress {
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
}
# VPC 1 Security group
resource "aws_security_group" "vpc_1_security_group" {
vpc_id = aws_vpc.main.id
# Add RDS Postgres ingress rule
ingress {
from_port = 5432
to_port = 5432
protocol = "tcp"
security_groups = [aws_security_group.ecs_sg.id]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "RDS_SG"
}
}
Configured the security groups with as few rules as possible to minimize the attack surface.
In ./output.tf
section:
output "rds_endpoint" {
value = aws_db_instance.rds_postgresql.endpoint
}
output "ecr_repo_url" {
value = module.computes.ecr_repo_url
}
Used these outputs to pass the important URLs and values to the pipeline/workflow (covered later in the Workflow Using Github Action section).
In ./terraform-prod.tfvars
or ./terraform-dev.tfvars
(whichever stage you use) section:
# Network CIDR blocks for the production environment
environment = "prod"
cidr_block = [
{
name = "vpc"
cidr = "10.0.0.0/16"
}
]
az = ["us-east-1a", "us-east-1b"]
container_port = 80
cpu = 1024
memory = 2048
db_master_password = "hello_pass"
These are the values for the variables declared in the ./variables.tf
section.
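If you want to exercise the same configuration locally before wiring it into the pipeline (a sketch, assuming AWS credentials are already exported and the image name matches the workflow's IMAGE_NAME):
cd infrastructure
terraform init
terraform plan -var-file="terraform-dev.tfvars" -var="image_name=my-python-app"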
Workflow Using Github Action
Workflow File Structure:
├── .github/workflows # Github Action Workflow
In .github/workflows/workflow.yml
section:
name: Build & Deploy Django to ECS
on:
workflow_dispatch:
inputs:
action:
description: "Terraform Action"
required: true
default: "apply"
type: choice
options:
- apply
- destroy
approve:
description: "Approve this action? (approve/dont)"
required: true
default: "dont"
type: choice
options:
- approve
- dont
It starts with a workflow_dispatch
, meaning the pipeline is only triggered manually. The person triggering it must choose two inputs: action
(apply or destroy) and approve
(approve or dont). This double confirmation prevents accidental deployments or infrastructure destruction. It acts as a safety lock to avoid surprises in production.
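Besides the Actions UI, the same dispatch can be triggered from the GitHub CLI (a sketch, assuming the file is named workflow.yml as above):
# Kick off an apply run with explicit approval
gh workflow run workflow.yml -f action=apply -f approve=approve
# Watch the run until it finishes
gh run watch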
env:
AWS_REGION: ${{ secrets.AWS_REGION }}
IMAGE_NAME: my-python-app
IMAGE_TAG: latest
The env block passes the values the rest of the workflow needs.
Under Jobs:
we will find infrastructure
, build
, push
, destroy
, and exit
In infrastructure
:
infrastructure:
runs-on: ubuntu-latest
if: ${{ github.event.inputs.action == 'apply' && github.event.inputs.approve == 'approve' }}
defaults:
run:
working-directory: ./infrastructure
steps:
- name: Checkout repo
uses: actions/checkout@v3
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v2
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ env.AWS_REGION }}
- name: Export image name to Terraform
run: echo "TF_VAR_image_name=${{ env.IMAGE_NAME }}" >> $GITHUB_ENV
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
- name: Terraform Init
run: terraform init
- name: Terraform Plan
run: |
terraform plan -out=tfplan \
-var-file="terraform-prod.tfvars" \
-var="image_name=${{ env.IMAGE_NAME }}"
- name: Terraform Apply
run: |
terraform apply -auto-approve \
-var-file="terraform-prod.tfvars" \
-var="image_name=${{ env.IMAGE_NAME }}"
This job runs only when both action
is set to apply
and approve
is set to approve
. Inside, it initializes Terraform and deploys the AWS infrastructure needed for your Django app. That usually means the ECS cluster, Application Load Balancer, security groups, RDS database, and ECR repository. Terraform reads the Docker image name from the environment, but the actual image does not exist yet; at this point the pipeline is just setting up the infrastructure shell. The Terraform apply uses terraform-prod.tfvars
, which likely contains production configuration like instance sizes, DB passwords (hopefully through variables or secrets), and VPC IDs. This separation allows infrastructure to be provisioned first, independently of the Docker build process.
build:
if: ${{ github.event.inputs.action == 'apply' && github.event.inputs.approve == 'approve' }}
runs-on: ubuntu-latest
defaults:
run:
working-directory: ./app
steps:
- name: Checkout Code
uses: actions/checkout@v3
- name: Build Docker image
run: |
docker build -t ${{ env.IMAGE_NAME }} .
docker save ${{ env.IMAGE_NAME }} -o image.tar
- name: Upload Docker image artifact
uses: actions/upload-artifact@v4
with:
name: docker-image
path: ./app/image.tar
compression-level: 9
Next comes the build
job, which also runs only if action
is apply
and approve
is approve
. Its purpose is to build the Docker image for the Django application. It uses docker build
to create the image locally and saves it as image.tar
. Instead of pushing the image right away, the pipeline uploads it as an artifact using GitHub's actions/upload-artifact
. This allows the push
job to download and reuse the same image later, ensuring consistency between build and deployment. It also avoids rebuilding the same image multiple times if other steps fail, which is good for debugging and repeatability, but using docker save/load
is slower compared to building and pushing directly in a single step.
I separated build and push because push needs the infrastructure (specifically the ECR repository) to exist before it can push the image. Splitting them also lets build and infrastructure run in parallel, which indirectly speeds up the workflow.
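For comparison, the faster (but more coupled) alternative mentioned above would build and push in a single job once the repository exists, roughly like this (a sketch; ecr_repo_url is the Terraform output, and AWS_REGION and IMAGE_TAG come from the workflow env):
# Authenticate Docker against the account's ECR registry (host part of the repo URL)
aws ecr get-login-password --region "$AWS_REGION" | docker login --username AWS --password-stdin "${ecr_repo_url%%/*}"
# Build and push directly, skipping the save/upload/download/load round trip
docker build -t "$ecr_repo_url:$IMAGE_TAG" ./app
docker push "$ecr_repo_url:$IMAGE_TAG"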
push:
runs-on: ubuntu-latest
needs: [infrastructure, build]
defaults:
run:
working-directory: ./app
steps:
- name: Checkout Code
uses: actions/checkout@v3
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v2
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ env.AWS_REGION }}
- name: Login to ECR
uses: aws-actions/amazon-ecr-login@v1
env:
AWS_ECR_LOGIN_MASK_PASSWORD: true
- name: Download image artifact
uses: actions/download-artifact@v4
with:
name: docker-image
path: ./app
- name: Terraform Init
working-directory: ./infrastructure
run: terraform init
- name: Get ECR Repo from Terraform
working-directory: ./infrastructure
id: get-ecr
run: |
ecr_repo_url=$(terraform output -raw ecr_repo_url)
echo "ecr_repo_url=$ecr_repo_url" >> $GITHUB_ENV
- name: Load and Tag Docker image
run: |
docker load -i image.tar
docker tag ${{ env.IMAGE_NAME }}:${{ env.IMAGE_TAG }} $ecr_repo_url:${{ env.IMAGE_TAG }}
- name: Push Docker image
run: |
docker push $ecr_repo_url:${{ env.IMAGE_TAG }}
After building the image, the push
job starts. This job depends on both infrastructure
and build
jobs completing successfully. It first downloads the previously built image.tar
from GitHub's artifact storage. Then it uses Terraform outputs to get the dynamically created ECR repository URL. This is important because the infrastructure layer controls where the image is supposed to go, and the workflow doesn't hardcode the ECR URL. After that, it loads the Docker image from image.tar
, tags it with the ECR repo URL, and pushes it to AWS ECR. At this point, the ECS service can pull the image directly from ECR in future deploys. This separation between build and push makes the workflow more flexible, but also introduces a risk: if Terraform outputs are wrong or missing (for example, ecr_repo_url
is not properly set), the push will fail even if the image build succeeded.
destroy:
runs-on: ubuntu-latest
if: ${{ github.event.inputs.action == 'destroy' && github.event.inputs.approve == 'approve' }}
defaults:
run:
working-directory: ./infrastructure
steps:
- name: Checkout repo
uses: actions/checkout@v3
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v2
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ env.AWS_REGION }}
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
- name: Terraform Init
run: terraform init
- name: Terraform Destroy Plan
run: |
terraform plan -destroy -out=destroy.plan \
-var-file="terraform-prod.tfvars" \
-var="image_name=${{ env.IMAGE_NAME }}"
- name: Terraform Destroy
run: terraform apply -auto-approve destroy.plan
The destroy
job handles the infrastructure teardown. It only runs if action
is destroy
and approve
is approve
. This job checks out the repo, configures AWS credentials, initializes Terraform, creates a destroy plan, and then applies it. This process destroys everything: ECS services, ALB, RDS, ECR repos, and any other AWS resources managed by Terraform. The use of terraform plan -destroy
ensures the operator can preview the destruction before applying if needed, but here the plan is auto-applied in one workflow run after approval. This is efficient but risky if not carefully monitored because resources are deleted immediately after confirmation.
exit:
runs-on: ubuntu-latest
if: ${{ github.event.inputs.approve == 'dont' }}
steps:
- name: Abort
run: echo "Action denied by reviewer."
Finally, the exit
job handles the case where someone triggers the pipeline but selects approve
as dont
. Instead of failing silently, this job runs a simple echo "Action denied by reviewer."
to make it clear that the workflow was intentionally aborted by human decision. This improves transparency in CI/CD logs, so that others reviewing the workflow understand it wasn’t a failure—it was a conscious choice not to proceed.
In the bigger picture, this pipeline is designed for safe, manual deployments rather than continuous integration or fast delivery. It prioritizes control over speed. All jobs are isolated: Terraform infra setup is separated from the Docker build and ECR push to reduce coupling and improve troubleshooting. However, there are tradeoffs. Docker artifacts are saved and loaded across jobs, which slows down the process compared to direct ECR pushes. There's no tagging strategy beyond latest
, so production deployments might accidentally overwrite images. Also, there’s no rollback if something fails after infrastructure is created but before the image is pushed. This pipeline is good for environments where you want to prevent mistakes more than you want speed, but in mature CI/CD pipelines, you might eventually automate some parts while keeping approval only for destructive actions like destroy
.
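One small improvement in that direction would be tagging images with the commit SHA in addition to latest, so a bad deploy can be rolled back to a known image (a sketch of the extra push steps, reusing the ecr_repo_url output and GitHub's built-in GITHUB_SHA):
# Tag the already-built image with the short commit SHA as an immutable reference
SHORT_SHA="${GITHUB_SHA::7}"
docker tag "$IMAGE_NAME:$IMAGE_TAG" "$ecr_repo_url:$SHORT_SHA"
docker push "$ecr_repo_url:$SHORT_SHA"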
Final Thoughts
Security is not free. But breaches cost more.
This setup prioritizes security and high availability, even if that means paying for:
VPC endpoints
Multi-AZ RDS
Load Balancing
If you’re building something serious—not just a hobby project—this trade-off is justified.
Code Repository
Contributors
Discussion
Would you choose lower cost with higher risk, or pay for security and redundancy upfront?
Let us know your thoughts!