Use Case
You are using Terraform to deploy AWS EC2 instances and EMR clusters, and you want to spread them randomly across your subnets.
The Issue
The AWS provider doesn't offer a direct way to say "give me a random subnet". You can get a set of subnet ids or you can get a single subnet, and neither is helpful for distributing your workload.
The Solution
Use the random_id resource and some basic modulo math to select a subnet at random. To check that this gives an even distribution across all the subnets, I ran a few tests and was happy with the results.
With 5 subnets, 1000 random ids produced:
"x=0", occurred 208 times for 20%
"x=1", occurred 205 times for 20%
"x=2", occurred 194 times for 19%
"x=3", occurred 192 times for 19%
"x=4", occurred 201 times for 20%
You can find my test code and run the numbers yourself in my terraform-tips-and-workarounds GitHub repo. I was running this on a MacBook with a Core i7 processor. If you run the test and don't get an even distribution, let me know so I can update the test results.
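If you just want a rough idea of how such a check could look without cloning the repo, here is a minimal sketch. It is not the test code from the repo; the names sample, subnet_count, buckets and distribution are made up for illustration, and it assumes only the random provider.

```hcl
# Hypothetical distribution check: generate 1000 random ids and
# count how many land in each bucket after the modulo step.
locals {
  subnet_count = 5 # assumed number of subnets
}

resource random_id sample {
  count       = 1000
  byte_length = 2
}

locals {
  # One bucket index (0..subnet_count-1) per random id.
  buckets = [for r in random_id.sample : r.dec % local.subnet_count]

  # Map of "x=<index>" to the number of ids that fell into that bucket.
  distribution = {
    for i in range(local.subnet_count) :
    "x=${i}" => length([for b in local.buckets : b if b == i])
  }
}

output distribution {
  value = local.distribution
}
```

Running terraform apply on this should print one count per bucket, and the counts should add up to 1000.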
Implementation
I'm going to review the Terraform segment by segment. You can access the source, which is ready to deploy, in my tips-tricks-workarounds GitHub repo.
List of subnets
First you need to get the list of subnets. This is done in 2 steps.
- First query for the default VPC. If you don't want to use your default VPC, look at the filter and tag options on aws_vpc to select the VPC dynamically.
- Get the subnet ids for the default VPC. If you don't want to use all the subnets, you can use the filter and tag options on aws_subnet_ids, similar to aws_vpc.
data aws_vpc default {
  default = true
}

data aws_subnet_ids current {
  vpc_id = data.aws_vpc.default.id
}
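If you are not using the default VPC, the tag options mentioned above could be used roughly like this. This is only a sketch: the data source names selected and private, and the tag values "my-app-vpc" and "private", are placeholders that you would replace with whatever tags exist in your account.

```hcl
# Hypothetical tags; replace with the tags used in your account.
data aws_vpc selected {
  tags = {
    Name = "my-app-vpc"
  }
}

# Only the subnets in that VPC that carry the (assumed) Tier tag.
data aws_subnet_ids private {
  vpc_id = data.aws_vpc.selected.id

  tags = {
    Tier = "private"
  }
}
```

Recent AWS provider releases deprecate aws_subnet_ids in favor of aws_subnets, so check the provider documentation for the version you have pinned.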
AMI
We need an AMI to deploy an EC2 instance. Here I am just querying for the latest release of the Amazon Linux 2 AMI.
data aws_ami current {
  most_recent = true

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  # Use Amazon Linux 2 AMI (HVM) SSD Volume Type
  name_regex = "^amzn2-ami-hvm-.*x86_64-gp2"
  owners     = ["137112412989"] # Amazon
}
Random Time
This is step 1 of the magic. We need to generate a random number. Note that you will need a separate random_id for each instance or EMR cluster you are deploying.
resource random_id index {
  byte_length = 2
}
The Math
- subnet_ids_list: We need to convert the subnet ids from a set to a list so we can access them by index.
- subnet_ids_random_index: We generate the random index from our random number. The % operator performs the modulo calculation. If you are not familiar with modulo, check out the Wikipedia article "Modulo operation".
- instance_subnet_id: Using the random index, we select a subnet id at random.
And poof, there is your magic in action.
locals {
  subnet_ids_list         = tolist(data.aws_subnet_ids.current.ids)
  subnet_ids_random_index = random_id.index.dec % length(data.aws_subnet_ids.current.ids)
  instance_subnet_id      = local.subnet_ids_list[local.subnet_ids_random_index]
}
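To make the modulo step concrete, here is a small worked example with literal values. The numbers and the example_ names are made up for illustration; with byte_length = 2, random_id produces a value between 0 and 65535.

```hcl
locals {
  example_dec          = 40123 # pretend this came from random_id.index.dec
  example_subnet_count = 5     # pretend the VPC has 5 subnets

  # 40123 = 5 * 8024 + 3, so the remainder (and the list index) is 3.
  example_index = local.example_dec % local.example_subnet_count
}
```

Because 65536 is not an exact multiple of 5, index 0 can be produced by 13108 of the possible values and each other index by 13107. That difference is far too small to matter for spreading instances across subnets.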
Using It
Now you have a random subnet id you can use in your aws_instance. Also, note the ignore_changes setting, which ensures that you don't accidentally destroy and recreate the instance on a future run. Without it, that would only happen if a new subnet was added to the VPC: the list length changes, so the same random number can map to a different subnet.
resource aws_instance instance {
  ami           = data.aws_ami.current.id
  instance_type = "t3.micro"
  subnet_id     = local.instance_subnet_id

  lifecycle {
    ignore_changes = [subnet_id]
  }

  tags = {
    Name = "random_subnet_test"
  }
}
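Since each instance needs its own random_id, one way to extend this to several instances is to use count on both resources. The following is a sketch under that assumption rather than something taken from the repo. It reuses data.aws_subnet_ids.current and data.aws_ami.current from above; the names instance_count, per_instance, all_subnet_ids, per_instance_subnet_ids and fleet are invented for the example.

```hcl
variable instance_count {
  type    = number
  default = 3 # assumed number of instances
}

# One random number per instance.
resource random_id per_instance {
  count       = var.instance_count
  byte_length = 2
}

locals {
  all_subnet_ids = tolist(data.aws_subnet_ids.current.ids)

  # Apply the same modulo trick to each random id.
  per_instance_subnet_ids = [
    for r in random_id.per_instance :
    local.all_subnet_ids[r.dec % length(local.all_subnet_ids)]
  ]
}

resource aws_instance fleet {
  count         = var.instance_count
  ami           = data.aws_ami.current.id
  instance_type = "t3.micro"
  subnet_id     = local.per_instance_subnet_ids[count.index]

  lifecycle {
    ignore_changes = [subnet_id]
  }
}
```

With only a few instances, some subnets may get more than one instance and others none. The spread only evens out as the number of instances grows.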