In this three-part series, I will explain how to use Kubernetes (K8s) and Terraform (TF) together to set up a Kubernetes cluster, manage applications and install Kasten. We will of course keep data management best practices in mind for every step. Installing Kasten in the cluster is also a great example of how Terraform can be used when managing cloud resources outside the cluster.
This series is intended for people who already have a basic understanding of Kubernetes and are questioning how Terraform could be useful in the context of Kubernetes.
In the first part, we discussed the concepts behind Terraform and Kubernetes, their similarities & differences, and how to use the two in harmony. This second part will be a hands-on example for setting up a Kubernetes cluster on AWS EKS with Terraform. And lastly, we will use Terraform to install Kasten and set up an S3 export location. You can also find all the code on GitHub.
Terraform for Cluster Deployment
How you deploy a K8s cluster with Terraform depends on your cloud provider or on-prem setup. For AWS you have two options: Kops and EKS (Elastic Kubernetes Service). Kops was created before EKS and is maintained by the Kubernetes community, while EKS is a managed AWS service. EKS is now available in most AWS regions and is my go-to recommendation, but there are some differences to keep in mind:
- Kops supports multiple cloud platforms and can even generate Terraform code for you.
- EKS has better encapsulation for etcd and the control plane by default. That increases security but also means a higher baseline cost.
- EKS also has significantly deeper integration with other AWS services.
Setting up an EKS Cluster on AWS
For our example, we’ll use EKS because of its security advantages and slightly easier management. Even if you have no interest in running a cluster on AWS or EKS, the general approach should still be useful in other environments. A basic understanding of Terraform would be useful but I’ll try to explain everything.
To follow along, you will need to make sure that you have:
- An AWS account with credentials configured for the AWS CLI
- The Terraform CLI installed
- kubectl installed
You can also use the repository with all the files that we will create as references.
Before we can create the cluster we need a little boilerplate. You can put all of the following code into a main.tf file.
# Configure the AWS provider
provider "aws" {
# Make sure to configure the region that makes the most sense for you
region = "eu-central-l"
}
# We access the default VPC (data source)
data "aws_vpc" "default" {
default = true
}
# ...and use the subnets that come with the default VPC
data "aws_subnet_ids" "default" {
# By referencing the block above we can determine the ID of the default VPC
vpc_id = data.aws_vpc.default.id
}
Furthermore, a locals block is useful for defining common constants:
locals { name = "example" # Assigning common tags to all resources help you manage them tags = { Project = "Terraform K8s Example Cluster" # Pro tip: Put the URL of your Git Repo in here Terraform = "True" } } (
Now, we can have a look at the EKS cluster resource definition. A minimal configuration has the following required fields:
- name
- role_arn
- vpc_config → subnet_ids
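To make that concrete, here is a stripped-down preview of the resource we will fully define further below (no need to add this to main.tf yet). Note the role_arn reference, which points at an IAM role that does not exist yet:
# Preview only: role_arn refers to an IAM role we have not created yet
resource "aws_eks_cluster" "cluster" {
  name     = local.name
  role_arn = aws_iam_role.cluster.arn

  vpc_config {
    subnet_ids = data.aws_subnet_ids.default.ids
  }
}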
We have the subnet IDs and name but still need an IAM role to assign to the cluster, so let's do that next:
resource "aws_iam_role" "cluster" { # Using a prefix ensures a unique name name_prefix = "eks-cluster-${local.name}-" assume_role_policy = jsonencode({ Statement = [{ Action = "sts:AssumeRole" Effect = "Allow" Principal = { Service = "eks.amazonaws.com" } }] Version = "2012-10-17" }) tags = local.tags } # We also need to attach additional policies: resource "aws_iam_role_policy_attachment" "cluster_eks_cluster_policy" { policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy" role = aws_iam_role.cluster.name } # Optionally, enable Security Groups for Pods # Reference: resource "aws_iam_role_policy_attachment" "cluster_eks_vpc_resource_controller" { policy_arn = "arn:aws:iam::aws:policy/AmazonEKSVPCResourceController" role = aws_iam_role.cluster.name }
Before we continue with creating the cluster itself, you can already create these resources:
- Run terraform init; this lets Terraform install the AWS provider
- Then run terraform apply; inspect the plan and ensure it matches your expectations before confirming with yes
If you want to stop at any point, you can run terraform destroy to safely and cleanly remove all created resources.
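For reference, the whole workflow on the command line boils down to these three commands (assuming your AWS credentials are already configured):
# Download and install the AWS provider
terraform init

# Show the planned changes and apply them after confirming with "yes"
terraform apply

# Remove all resources created by this project again
terraform destroy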
Now we finally have all the ingredients to define the EKS cluster:
resource "aws_eks_cluster" "cluster" { name = local.name role_arn = aws_iam_role.cluster.arn version = "1.18" vpc_config { subnet_ids = data.aws_subnet_ids.default.ids } # Ensure that IAM Role permissions are created before and deleted after EKS Cluster handling. # Otherwise, EKS will not be able to properly delete EKS managed EC2 infrastructure such as Security Groups. depends_on = [ aws_iam_role_policy_attachment.cluster_eks_cluster_policy, aws_iam_role_policy_attachment.cluster_eks_vpc_resource_controller, ] tags = local.tags }
You may be looking for instance configuration (instance type, count, etc.), and that is what we will set up next using the aws_eks_node_group resource.
But first we need another IAM role:
resource "aws_iam_role" "nodes" { name_prefix = "eks-nodes-${local.name}-" assume_role_policy = jsonencode({ Statement = [{ Action = "sts:AssumeRole" Effect = "Allow" Principal = { Service = "ec2.amazonaws.com" } }] Version = "2012-10-17" }) tags = local.tags } resource "aws_iam_role_policy_attachment" "nodes_eks_worker_node_policy" { policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy" role = aws_iam_role.nodes.name } resource "aws_iam_role_policy_attachment" "nodes_eks_cni_policy" { policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy" role = aws_iam_role.nodes.name } resource "aws_iam_role_policy_attachment" "nodes_ec2_container_registry_read_only" { policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly" role = aws_iam_role.nodes.name }
Now the nodes:
resource "aws_eks_node_group" "nodes" { cluster_name = aws_eks_cluster.cluster.name node_group_name = "default" node_role_arn = aws_iam_role.nodes.arn subnet_ids = data.aws_subnet_ids.default.ids # We start with a minimal setup scaling_config { desired_size = 3 max_size = 3 min_size = 3 } # I'd recommend t3.large or t3.xlarge for most production workloads instance_types = ["t3.medium"] # Ensure that IAM Role permissions are created before and deleted after EKS Node Group handling. # Otherwise, EKS will not be able to properly delete EC2 Instances and Elastic Network Interfaces. depends_on = [ aws_iam_role_policy_attachment.nodes_eks_worker_node_policy, aws_iam_role_policy_attachment.nodes_eks_cni_policy, aws_iam_role_policy_attachment.nodes_ec2_container_registry_read_only, ] tags = local.tags }
As an alternative to fixed nodes, EKS also allows you to use Fargate, which provides an isolated, on-demand environment for each pod. Fargate is powered by Firecracker, the same technology that powers AWS Lambda. It is certainly an interesting option, but it is too different from a standard K8s setup for our purposes and also not cheap.
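For completeness, a Fargate setup would roughly look like the sketch below. This is only an illustration and not part of our cluster: the subnet IDs are placeholders, and Fargate profiles require private subnets, which the default VPC does not provide.
# Rough sketch only, not applied in this tutorial.
# Fargate needs its own pod execution role...
resource "aws_iam_role" "fargate" {
  name_prefix = "eks-fargate-${local.name}-"

  assume_role_policy = jsonencode({
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "eks-fargate-pods.amazonaws.com"
      }
    }]
    Version = "2012-10-17"
  })

  tags = local.tags
}

resource "aws_iam_role_policy_attachment" "fargate_pod_execution" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSFargatePodExecutionRolePolicy"
  role       = aws_iam_role.fargate.name
}

# ...and a profile that decides which pods run on Fargate
resource "aws_eks_fargate_profile" "default" {
  cluster_name           = aws_eks_cluster.cluster.name
  fargate_profile_name   = "default"
  pod_execution_role_arn = aws_iam_role.fargate.arn
  # Placeholders: Fargate requires private subnets
  subnet_ids             = ["subnet-placeholder-private-a", "subnet-placeholder-private-b"]

  # Only pods in this namespace would be scheduled on Fargate
  selector {
    namespace = "default"
  }

  tags = local.tags
}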
Now we create the EKS cluster and nodes by executing terraform apply again. This may take a while.
Configuring access to the cluster for kubectl is super easy using the AWS CLI (see: connect to EKS cluster):
aws eks update-kubeconfig --name example

# Make sure you switch to the new context:
kubectl config use-context <the context name printed by the previous command>

# Test it out, this should show the nodes we configured:
kubectl get nodes
Congratulations, you now have a running Kubernetes cluster managed by Terraform.
Further Optimization
To prepare your cluster for production you may want to:
- Install an Ingress controller
- Create dedicated (and private) subnets for the EKS nodes
- Limit the network traffic between nodes using security groups
- Block public access to the API server endpoint via CIDR IP restrictions or a bastion pattern
- Configure auto-scaling for the node group
Most of these changes can be made right in your Terraform project. For example, if you want to deploy additional nodes to your cluster, you can simply update the node count and apply again.
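As a sketch of that last point, a hypothetical node_count variable (not part of the code above) could drive the scaling_config, so resizing the node group becomes a one-line change or even just a CLI flag such as terraform apply -var node_count=5:
# Hypothetical variable to make the node count configurable
variable "node_count" {
  type    = number
  default = 3
}

# In aws_eks_node_group.nodes, scaling_config would then reference it:
#   scaling_config {
#     desired_size = var.node_count
#     max_size     = var.node_count
#     min_size     = var.node_count
#   }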