Day 13 - Terraform Data Sources

Today’s learning felt like a shift from “building everything” to working intelligently with what already exists.

Until now, most of my Terraform work was about creating infrastructure. But in real-world cloud environments, things are rarely that simple. Networks, security layers, and shared resources are often already in place, managed by different teams.

This is where Terraform data sources come in.


What Are Terraform Data Sources

Terraform data sources allow us to read existing infrastructure instead of creating it.

A simple way to think about it:

  • Resources → Create and manage infrastructure
  • Data Sources → Read and reference existing infrastructure

This distinction is small in syntax, but huge in real-world usage.
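
To make the syntax difference concrete, here is a minimal sketch. The block labels (example_new, example_existing), the CIDR range, and the tag value are hypothetical and only illustrate the shape of each block.

```hcl
# Resource: Terraform creates this VPC and owns its lifecycle.
resource "aws_vpc" "example_new" {
  cidr_block = "10.0.0.0/16" # hypothetical range
}

# Data source: Terraform only looks up a VPC that already exists.
data "aws_vpc" "example_existing" {
  filter {
    name   = "tag:Name"
    values = ["some-existing-vpc"] # hypothetical tag value
  }
}

# References differ only by the "data." prefix.
output "new_vpc_id" {
  value = aws_vpc.example_new.id
}

output "existing_vpc_id" {
  value = data.aws_vpc.example_existing.id
}
```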


Scenario for This Demo

In this lab, I simulated a real-world setup:

  • A shared VPC already exists
  • A shared subnet already exists
  • My job is to launch an EC2 instance inside that network

The key rule:

  • I should NOT recreate the VPC or subnet
  • I should only reference them using data sources


Architecture Diagram

In this setup, Terraform references existing infrastructure through data sources and creates new resources without modifying the original network. The VPC and subnet are created separately in a setup phase. The main configuration then reads them through data sources and launches an EC2 instance inside them.



Step 1: Create Shared Infrastructure

I first created a VPC and subnet using a separate Terraform setup.

These resources simulate infrastructure created by another team.

Example:

resource "aws_vpc" "shared_vpc" {
  cidr_block = "10.13.0.0/16"

  tags = {
    Name = "shared-network-vpc"
  }
}

resource "aws_subnet" "shared_subnet" {
  vpc_id     = aws_vpc.shared_vpc.id
  cidr_block = "10.13.1.0/24"

  tags = {
    Name = "shared-primary-subnet"
  }
}

  • Screenshot of setup/main.tf
  • Screenshot of terraform apply (showing VPC and subnet created)



Step 2: Use Data Sources to Read Existing Infrastructure

Instead of recreating the VPC, I used this:

data "aws_vpc" "shared" {
  filter {
    name   = "tag:Name"
    values = ["shared-network-vpc"]
  }
}

For subnet:

data "aws_subnet" "shared" {
  filter {
    name   = "tag:Name"
    values = ["shared-primary-subnet"]
  }

  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.shared.id]
  }
}

  • Screenshot of data source blocks in main.tf
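
One way to confirm that the lookups resolved to the expected network is to expose a few attributes as outputs. This is only a sketch: it assumes the two data blocks above live in the same configuration, and the output names are hypothetical.

```hcl
# Output names are hypothetical; the attributes come from the data sources above.
output "shared_vpc_id" {
  value = data.aws_vpc.shared.id
}

output "shared_vpc_cidr" {
  value = data.aws_vpc.shared.cidr_block
}

output "shared_subnet_id" {
  value = data.aws_subnet.shared.id
}

output "shared_subnet_az" {
  value = data.aws_subnet.shared.availability_zone
}
```

If a filter matches nothing, or matches more than one VPC or subnet, these data sources fail with an error instead of silently picking one, which is a useful safety net when tag names are reused.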



Step 3: Dynamically Fetch Latest AMI

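Instead of hardcoding AMI IDs, I used:

data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

With most_recent = true and a name filter, Terraform resolves the newest matching Amazon Linux 2 image each time it runs. The name filter matters: owners = ["amazon"] on its own matches any Amazon-owned image, not only Amazon Linux 2. The pattern here assumes the x86_64, gp2-backed images.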

  • Screenshot showing AMI data source
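
Because owners = ["amazon"] on its own matches every image Amazon publishes, the lookup is usually narrowed with filters. Below is a sketch of a stricter variant. The data source label al2_strict and the output name are hypothetical, and the name pattern assumes the x86_64, gp2-backed Amazon Linux 2 images.

```hcl
data "aws_ami" "al2_strict" { # hypothetical label
  most_recent = true
  owners      = ["amazon"]

  # Match Amazon Linux 2 HVM images by name (assumed pattern).
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }
}

# Shows which image the lookup resolved to.
output "al2_ami_name" {
  value = data.aws_ami.al2_strict.name
}
```

One trade-off to keep in mind: with most_recent = true the resolved AMI ID can change when a newer image is published, and a changed ami value makes Terraform plan a replacement of the instance.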



Step 4: Launch EC2 Using Data Sources

Now the magic happens.

resource "aws_instance" "day13_instance" {
  ami           = data.aws_ami.amazon_linux_2.id
  instance_type = "t2.micro"
  subnet_id     = data.aws_subnet.shared.id
}

This connects everything:

  • AMI → from AWS
  • Subnet → from existing infra
  • EC2 → created by Terraform

  • Screenshot of EC2 resource block


Terraform Plan Output

When I ran terraform plan, Terraform showed:

  • 1 resource to create → EC2 instance
  • No changes to VPC or subnet

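This confirmed that the data sources were working correctly: Terraform only reads them during the plan, so the VPC and subnet never show up as resources to add, change, or destroy.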

  • Screenshot of terraform plan



AWS Console Verification

After applying:

  • EC2 instance was created
  • It was placed inside the shared VPC and subnet

  • Screenshot of EC2 instance overview
  • Screenshot of Networking tab showing VPC and subnet
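
The same check can be done from the terminal instead of the console. This is a sketch that assumes the aws_instance.day13_instance resource and the data.aws_subnet.shared lookup from the earlier steps; the output names are hypothetical.

```hcl
output "instance_id" {
  value = aws_instance.day13_instance.id
}

# Should equal the ID of the shared subnet found by the data source.
output "instance_subnet_id" {
  value = aws_instance.day13_instance.subnet_id
}

output "instance_private_ip" {
  value = aws_instance.day13_instance.private_ip
}

# The VPC that the shared subnet belongs to, as reported by AWS.
output "instance_vpc_id" {
  value = data.aws_subnet.shared.vpc_id
}
```

After terraform apply, running terraform output prints these values, and the private IP should fall inside the subnet's 10.13.1.0/24 range.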



Security Configuration (Important Learning)

I restricted SSH access using my public IP:

cidr_blocks = ["12.xx.xxx.xxx/32"]

This allows SSH connections only from my public IP address.

A quick way to read these CIDR ranges:

  • /32 → exactly one IP address (tightly scoped)
  • 0.0.0.0/0 → the entire internet (not safe for production)
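
For context, a cidr_blocks line like this normally sits inside an ingress rule of a security group. Here is a minimal sketch, assuming the data.aws_vpc.shared lookup from Step 2. The variable my_ip_cidr, the resource label ssh_from_my_ip, and the group name are hypothetical, and the outbound rule is an assumption rather than part of the original lab.

```hcl
variable "my_ip_cidr" {
  description = "Your public IP in CIDR form, ending in /32 (hypothetical variable)"
  type        = string
}

resource "aws_security_group" "ssh_from_my_ip" {
  name   = "day13-ssh-from-my-ip" # hypothetical name
  vpc_id = data.aws_vpc.shared.id # placed in the shared VPC, read via data source

  ingress {
    description = "SSH from a single address"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.my_ip_cidr]
  }

  # Assumption: allow all outbound traffic, a common default.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

To attach the group, the instance would set vpc_security_group_ids = [aws_security_group.ssh_from_my_ip.id]. Passing the IP through a variable also keeps the real address out of the committed code.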

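Resource vs Data Source (Simple Comparison)

  • Resource → declared with a resource block; Terraform creates, updates, and destroys it
  • Data source → declared with a data block; Terraform only reads it and never modifies it
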

Key Learning

The biggest takeaway for me:

Terraform is not just a tool to create infrastructure.
It is also a tool to connect existing systems safely and intelligently.

Data sources allow teams to work independently while still using shared infrastructure.


Final Thoughts

Today’s lesson felt closer to real cloud architecture.

In real environments:

  • Networks are shared
  • Teams are separate
  • Ownership is distributed

Terraform data sources make it possible to work in this kind of environment without breaking boundaries.

A simple idea, but powerful:

Resources build. Data sources connect.


Jay
