Resources & Data Sources

Resources are where Terraform’s declarative magic happens. You describe what you want—a database, a load balancer, a network—and Terraform figures out how to make it real. But behind that simple concept lies a sophisticated system for managing dependencies, handling failures, and coordinating complex infrastructure changes.

The difference between writing basic Terraform and writing maintainable, production-ready configurations comes down to understanding how resources relate to each other, when to use different lifecycle rules, and how to handle the edge cases that inevitably arise in real-world infrastructure.

Resource Basics and Lifecycle

Every resource in Terraform follows a lifecycle: Create, Read, Update, Delete (CRUD). But cloud resources are more complex than database records—some can’t be updated in place, others have dependencies that affect the order of operations, and some require special handling during destruction.

resource "aws_instance" "web" {
  ami           = "ami-12345678"
  instance_type = "t3.micro"
  
  # Lifecycle rules control how Terraform handles changes
  lifecycle {
    create_before_destroy = true
    prevent_destroy       = false
    ignore_changes        = [ami]
  }
  
  tags = {
    Name = "web-server"
  }
}

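The lifecycle block gives you control over how Terraform manages the resource. create_before_destroy is particularly useful for resources that can’t be updated in place: Terraform creates the replacement before destroying the old one, which reduces downtime. prevent_destroy makes any plan that would destroy the resource fail with an error, and ignore_changes tells Terraform to disregard differences in the listed attributes (here, the AMI) when planning updates.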
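One practical consequence of create_before_destroy is that the old and the new resource exist side by side for a short time, so attributes that must be unique, such as names, can collide. A minimal sketch, assuming a hypothetical security group called "app" and the aws_vpc.main resource defined later in this part, uses name_prefix so that each replacement gets a freshly generated name:

```hcl
# Hypothetical example: "app" is an illustrative name, not part of the original configuration.
resource "aws_security_group" "app" {
  # name_prefix lets the provider generate a unique name for each replacement,
  # so the new group can be created while the old one still exists.
  name_prefix = "app-"
  vpc_id      = aws_vpc.main.id

  lifecycle {
    create_before_destroy = true
  }
}
```

With a fixed name argument instead of name_prefix, the create step would likely fail because the old group still holds that name.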

Understanding Resource Dependencies

Terraform automatically detects dependencies when you reference one resource from another:

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
  
  tags = {
    Name = "main-vpc"
  }
}

resource "aws_subnet" "web" {
  vpc_id     = aws_vpc.main.id  # This creates a dependency
  cidr_block = "10.0.1.0/24"
  
  tags = {
    Name = "web-subnet"
  }
}

resource "aws_instance" "web" {
  ami           = "ami-12345678"
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.web.id  # Another dependency
  
  tags = {
    Name = "web-server"
  }
}

Terraform builds a dependency graph and creates resources in the correct order: VPC first, then subnet, then instance. When you destroy this configuration, Terraform walks the graph in reverse: instance first, then subnet, then VPC.

Sometimes you need explicit dependencies for resources that don’t directly reference each other:

resource "aws_instance" "web" {
  ami           = "ami-12345678"
  instance_type = "t3.micro"
  
  # This instance needs the S3 bucket to exist, but doesn't reference it directly
  depends_on = [aws_s3_bucket.app_data]
}

resource "aws_s3_bucket" "app_data" {
  bucket = "my-app-data-bucket"
}
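A common real-world case for depends_on is IAM: an instance references its instance profile, but nothing in the instance refers to the policy that actually grants access. The sketch below is an assumption-laden illustration (all resource names such as "app" and the policy contents are hypothetical) that reuses the aws_s3_bucket.app_data bucket from above:

```hcl
# Role that EC2 instances are allowed to assume.
resource "aws_iam_role" "app" {
  name = "app-role" # hypothetical name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

# Policy granting read access to objects in the bucket.
resource "aws_iam_role_policy" "app_s3" {
  name = "app-s3-read" # hypothetical name
  role = aws_iam_role.app.name

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action   = "s3:GetObject"
      Effect   = "Allow"
      Resource = "${aws_s3_bucket.app_data.arn}/*"
    }]
  })
}

resource "aws_iam_instance_profile" "app" {
  role = aws_iam_role.app.name
}

resource "aws_instance" "app" {
  ami                  = "ami-12345678"
  instance_type        = "t3.micro"
  iam_instance_profile = aws_iam_instance_profile.app.name # implicit dependency

  # The instance never references the policy, so without this line Terraform
  # could boot the instance before the S3 permissions are attached.
  depends_on = [aws_iam_role_policy.app_s3]
}
```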

Working with Collections and Count

Real infrastructure often involves multiple similar resources. Terraform provides several ways to handle this:

Count creates multiple instances of a resource:

resource "aws_instance" "web" {
  count         = 3
  ami           = "ami-12345678"
  instance_type = "t3.micro"
  
  tags = {
    Name = "web-${count.index + 1}"
  }
}

# Reference specific instances
output "first_instance_ip" {
  value = aws_instance.web[0].public_ip
}

# Reference all instances
output "all_instance_ips" {
  value = aws_instance.web[*].public_ip
}

for_each is more flexible and works with maps or sets of strings:

variable "instances" {
  type = map(object({
    instance_type = string
    ami           = string
  }))

  default = {
    web1 = {
      instance_type = "t3.micro"
      ami           = "ami-12345678"
    }
    web2 = {
      instance_type = "t3.small"
      ami           = "ami-87654321"
    }
  }
}

resource "aws_instance" "web" {
  for_each      = var.instances
  ami           = each.value.ami
  instance_type = each.value.instance_type
  
  tags = {
    Name = each.key
  }
}

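The advantage of for_each is that each instance is tracked by its key, so adding or removing an item doesn’t affect the others. With count, instances are tracked by index: removing an item from the middle of a list shifts every later index, and Terraform may destroy and recreate resources unnecessarily.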
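Referencing for_each instances also differs from count: they are addressed by key rather than by index, and the [*] splat operator does not apply to the resulting map. A minimal sketch, assuming a hypothetical variable web_names and a resource named "named" to keep it separate from the examples above:

```hcl
# Hypothetical set of instance names; any set of strings works with for_each.
variable "web_names" {
  type    = set(string)
  default = ["web1", "web2"]
}

resource "aws_instance" "named" {
  for_each      = var.web_names
  ami           = "ami-12345678"
  instance_type = "t3.micro"

  tags = {
    Name = each.key # for a set, each.key and each.value are the same string
  }
}

# Reference a specific instance by its key
output "web1_ip" {
  value = aws_instance.named["web1"].public_ip
}

# Reference all instances with a for expression (splat does not work on maps)
output "ips_by_name" {
  value = { for name, inst in aws_instance.named : name => inst.public_ip }
}
```

Removing "web1" from the set would plan the destruction of only aws_instance.named["web1"], leaving "web2" untouched.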

Data Sources for External Information

Data sources fetch information about resources that exist outside your Terraform configuration. They’re read-only, and Terraform reads them again each time it builds a plan:

# Find the latest Ubuntu AMI
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical
  
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
  
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# Get information about availability zones
data "aws_availability_zones" "available" {
  state = "available"
}

# Use data source values
resource "aws_instance" "web" {
  ami               = data.aws_ami.ubuntu.id
  instance_type     = "t3.micro"
  availability_zone = data.aws_availability_zones.available.names[0]
}

Data sources make your configurations more portable and self-updating. Instead of hard-coding AMI IDs that become outdated, the configuration looks up the latest matching image on every run.
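The flip side of most_recent = true is that the lookup result changes whenever a newer image is published, and a changed ami value forces the instance to be replaced. One possible way to keep running instances stable, sketched here under the assumption that you want new images to apply only to newly created instances, is to combine the data source with ignore_changes:

```hcl
resource "aws_instance" "web" {
  # Resolved from the data source when the instance is first created
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"

  lifecycle {
    # Later AMI releases no longer trigger a replacement of this instance.
    ignore_changes = [ami]
  }
}
```

When you do want to roll the instance onto a newer image, terraform apply -replace="aws_instance.web" recreates it with the current lookup result.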

Complex Resource Relationships

Real-world infrastructure involves complex relationships between resources. Here’s an example that shows several patterns:

# VPC and networking
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
  
  tags = {
    Name = "main-vpc"
  }
}

resource "aws_subnet" "public" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index + 1}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]
  
  map_public_ip_on_launch = true
  
  tags = {
    Name = "public-subnet-${count.index + 1}"
    Type = "public"
  }
}

# Security group that references the VPC
resource "aws_security_group" "web" {
  name_prefix = "web-"
  vpc_id      = aws_vpc.main.id
  
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  tags = {
    Name = "web-security-group"
  }
}

# Load balancer that depends on subnets and security group
resource "aws_lb" "main" {
  name               = "main-lb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.web.id]
  subnets            = aws_subnet.public[*].id
  
  tags = {
    Name = "main-load-balancer"
  }
}

This configuration creates a VPC, two subnets spread across availability zones (using the aws_availability_zones data source from the previous section), a security group, and a load balancer. Terraform automatically handles the dependencies and creates everything in the right order.
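The subnet CIDR blocks above are built with string interpolation, which silently assumes the VPC range is 10.0.0.0/16. As an alternative sketch, the built-in cidrsubnet function can derive them from the VPC’s own cidr_block, so the subnets follow the VPC if its range changes:

```hcl
resource "aws_subnet" "public" {
  count  = 2
  vpc_id = aws_vpc.main.id

  # cidrsubnet(prefix, newbits, netnum): adding 8 bits to a /16 yields /24 networks.
  # With a 10.0.0.0/16 VPC, netnum 1 and 2 give 10.0.1.0/24 and 10.0.2.0/24,
  # the same ranges as the interpolated version.
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index + 1)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  map_public_ip_on_launch = true
}
```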

Resource Provisioners and Local Execution

Sometimes you need to run commands or scripts as part of resource creation. Provisioners handle this, but use them sparingly—they make your infrastructure less predictable:

resource "aws_instance" "web" {
  ami           = "ami-12345678"
  instance_type = "t3.micro"
  key_name      = "my-key-pair"
  
  # Run commands on the remote instance
  provisioner "remote-exec" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y nginx",
      "sudo systemctl start nginx"
    ]
    
    connection {
      type        = "ssh"
      user        = "ubuntu"
      private_key = file("~/.ssh/id_rsa")
      host        = self.public_ip
    }
  }
  
  # Run commands locally
  provisioner "local-exec" {
    command = "echo 'Instance ${self.id} created' >> instances.log"
  }
}

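By default, provisioners run only when a resource is created; a provisioner with when = destroy runs before the resource is destroyed instead. If a creation-time provisioner fails, Terraform marks the resource as tainted so that it is recreated on the next apply. Provisioners are useful for bootstrapping, but consider using user data scripts or configuration management tools for complex setup.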
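As a rough illustration of the user data alternative, the nginx installation from the remote-exec example could be expressed as a boot script instead. This sketch assumes an Ubuntu-based AMI (the placeholder ID is carried over from the examples above) and needs no SSH connection from the machine running Terraform:

```hcl
resource "aws_instance" "web" {
  ami           = "ami-12345678" # assumed to be an Ubuntu image
  instance_type = "t3.micro"

  # cloud-init runs this script as root on first boot, so sudo is not needed.
  # The <<- heredoc form strips the leading indentation.
  user_data = <<-EOF
    #!/bin/bash
    apt-get update
    apt-get install -y nginx
    systemctl start nginx
  EOF
}
```

Unlike a provisioner, Terraform does not wait for the script to finish or report its failure, so health checks are a sensible complement.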

Handling Resource Failures and Recovery

Sometimes resources fail to create or get into inconsistent states. Terraform provides tools to handle these situations:

# Mark a resource as tainted (will be recreated on next apply)
# Note: deprecated since Terraform v0.15.2 in favor of -replace
terraform taint aws_instance.web

# Untaint a resource
terraform untaint aws_instance.web

# Replace a specific resource (the recommended approach in current versions)
terraform apply -replace="aws_instance.web"

# Import existing resources into Terraform state
terraform import aws_instance.web i-1234567890abcdef0

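The import command is particularly useful when you have existing infrastructure that you want to manage with Terraform. It only records the resource in state, so you still need to write a matching resource block in your configuration.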
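Terraform 1.5 and later also accept import blocks, which declare the import in configuration so that it shows up in the plan and can be reviewed before anything is written to state. A minimal sketch using the same instance ID as the command above (the resource arguments are placeholders and would need to match the real instance):

```hcl
# Declarative import: applied as part of a normal plan/apply cycle.
import {
  to = aws_instance.web
  id = "i-1234567890abcdef0"
}

resource "aws_instance" "web" {
  # Placeholder values; adjust them until the plan shows no unexpected changes.
  ami           = "ami-12345678"
  instance_type = "t3.micro"
}
```

Once the import has been applied, the import block can be removed from the configuration.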

Resource Meta-Arguments

Terraform provides several meta-arguments that work with any resource type:

depends_on for explicit dependencies:

resource "aws_instance" "web" {
  # configuration...
  depends_on = [aws_security_group.web]
}

count and for_each for multiple instances (a single resource block can use one or the other, not both):

resource "aws_instance" "web" {
  count = var.instance_count
  # configuration...
}

provider for using alternate provider configurations:

resource "aws_instance" "web" {
  provider = aws.west
  # configuration...
}
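The aws.west reference only resolves if a provider configuration with that alias exists. A minimal sketch of what that might look like, with both regions chosen purely for illustration:

```hcl
# Default configuration, used by resources that do not set "provider"
provider "aws" {
  region = "us-east-1" # assumed region
}

# Additional configuration, selected with provider = aws.west
provider "aws" {
  alias  = "west"
  region = "us-west-2" # assumed region
}
```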

lifecycle for controlling resource behavior:

resource "aws_instance" "web" {
  lifecycle {
    create_before_destroy = true
    prevent_destroy       = true
    ignore_changes        = [tags]
  }
}

Working with Sensitive Resources

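Some resources contain sensitive information that shouldn’t appear in logs or CLI output. Marking values as sensitive redacts them from plan and apply output, but they are still stored in plain text in the state file, so the state itself must be protected as well: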

resource "aws_db_instance" "main" {
  allocated_storage   = 20
  storage_type        = "gp2"
  engine              = "mysql"
  engine_version      = "8.0"
  instance_class      = "db.t3.micro"
  db_name             = "myapp"
  username            = "admin"
  password            = var.db_password  # Supplied by a variable declared with sensitive = true
  skip_final_snapshot = true
  
  tags = {
    Name = "main-database"
  }
}

# Don't expose sensitive values in outputs
output "database_endpoint" {
  value = aws_db_instance.main.endpoint
}

# Mark sensitive outputs appropriately
output "database_password" {
  value     = aws_db_instance.main.password
  sensitive = true
}
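The database example relies on var.db_password, whose declaration is not shown. A plausible declaration, assuming Terraform 0.14 or later where variables support the sensitive argument, might look like this:

```hcl
variable "db_password" {
  description = "Master password for the database"
  type        = string

  # Redacts the value in plan and apply output; it is still written to state.
  sensitive = true
}
```

Leaving out a default keeps the secret out of version control; the value can be supplied at run time, for example through the TF_VAR_db_password environment variable.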

Performance and Optimization

Large Terraform configurations can be slow. Here are some optimization strategies:

Use data sources efficiently: Data sources are refreshed on every run, so minimize expensive queries.

Leverage parallelism: Terraform creates resources in parallel when possible. The -parallelism flag controls how many operations run simultaneously (the default is 10), for example terraform apply -parallelism=20.

Split large configurations: Instead of one massive configuration, use multiple smaller ones with remote state data sources to share information.

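Use targeted operations: When debugging, use -target to operate on specific resources. Terraform’s documentation recommends this for exceptional situations only, not as part of a routine workflow: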

terraform apply -target="aws_instance.web"
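For the split-configuration strategy mentioned above, the terraform_remote_state data source is one way for an application configuration to read outputs from a separately managed network configuration. The sketch below makes several assumptions: an S3 backend, a hypothetical bucket and key, and a network configuration that exports an output named public_subnet_ids.

```hcl
# Read the outputs of a separately applied "network" configuration.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "my-terraform-state"        # hypothetical bucket name
    key    = "network/terraform.tfstate" # hypothetical state path
    region = "us-east-1"                 # assumed region
  }
}

resource "aws_instance" "web" {
  ami           = "ami-12345678"
  instance_type = "t3.micro"

  # public_subnet_ids is assumed to be an output of the network configuration.
  subnet_id = data.terraform_remote_state.network.outputs.public_subnet_ids[0]
}
```

Only root-level outputs of the other configuration are visible this way, so anything to be shared has to be exported explicitly.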

What’s Coming Next

Understanding resources and data sources gives you the building blocks for any infrastructure. You can create complex, interdependent systems that Terraform manages reliably. The patterns you’ve learned—dependencies, collections, and lifecycle management—apply to every cloud provider and resource type.

In the next part, we’ll explore modules—Terraform’s way of creating reusable, composable infrastructure components. Modules let you package common patterns, share them across projects, and build infrastructure libraries that make your team more productive and your infrastructure more consistent.