Pulling infrastructure data from Terraform with shell script provisioners

Hello,

I thought I’d share a pattern I’ve been working on this week to automate setting up our Harness environments and infrastructure definitions by using a shell script provisioner.

While Harness supports provisioning via Terraform directly (see Harness :heart: Terraform), I was looking for a way to get outputs (like AWS IAM role IDs, load balancer names, etc.) from our existing Terraform-provisioned infrastructure into Harness, rather than having Harness run Terraform itself, for a few reasons:

  1. We use Terragrunt, a wrapper around Terraform, which isn’t supported.
  2. Our production infra already exists and doesn’t need to be provisioned on the fly, so there’s no need to run Terraform during deployments.
  3. Harness needs fewer permissions, since it only reads infrastructure details rather than managing them.

We had been using the “Already Provisioned Infrastructure” radio button on our infrastructure definitions and filling in the values manually for each service, which was pretty laborious and error-prone.

Instead, we looked at using a shell script infrastructure provisioner to fetch information from the last Terraform run rather than trying to run Terraform from Harness. The same pattern might work for other infra tools that you don’t want to, or can’t, run.

First, in Terraform, we started writing values into AWS Systems Manager Parameter Store, which lets us store a tree structure of configuration values. Each resource we create in Terraform has the relevant IDs and/or names written into parameters.

This is quite simple from Terraform and works much like defined outputs. Here’s a snippet from a module that’s creating some of the infrastructure used for this service:

resource "aws_ssm_parameter" "string_outputs" {
  for_each = {
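    # data.aws_arn.ecs_cluster (declared elsewhere in this module) parses
    # var.ecs_cluster_arn, so .resource below is "cluster/<name>" and
    # .region is the cluster's region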
    "ecs_cluster_arn"              = var.ecs_cluster_arn
    "ecs_cluster_name"             = split("/", data.aws_arn.ecs_cluster.resource)[1]
    "region"                       = data.aws_arn.ecs_cluster.region
    "security_group_id"            = aws_security_group.fargate.id
    "task_execution_iam_role_id"   = aws_iam_role.fargate_task_execution_role.arn
    "task_execution_iam_role_name" = aws_iam_role.fargate_task_execution_role.name
    "task_iam_role_id"             = aws_iam_role.fargate_task_role.arn
    "task_iam_role_name"           = aws_iam_role.fargate_task_role.name
    "vpc_id"                       = var.vpc_id
  }

  name        = "/terraform/${var.service_name}/${each.key}"
  description = "Terraform output value for service ${var.service_name}"

  type  = "String"
  value = each.value
}

Any time Terraform runs, these parameters are kept in sync with infrastructure changes.
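
If you want to spot-check what Terraform has written for a service before wiring anything into Harness, the AWS CLI can list the parameters directly (the service name below is just the example one used later in the script, and the --query projection is optional):

# List the parameters Terraform has written for one service
aws ssm get-parameters-by-path \
  --path "/terraform/widget-stage-worker" \
  --recursive \
  --query 'Parameters[].[Name,Value]' \
  --output table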

In Harness, we then have a shell script provisioner that reads back the values from the parameter store. Here’s the script body:

#!/bin/bash

# This script is intended for use in Harness as a shell script provisioner.
# The script expects the below variables to be injected by Harness.

# Name of the service under /terraform in SSM parameter store
#export SERVICE_NAME="widget-stage-worker"

# File to output infrastructure data in JSON format, supplied by Harness automatically
# If unset, will use a temporary file and output to STDOUT anyway
#export PROVISIONER_OUTPUT_PATH=/tmp/output.json

set -euo pipefail

TMP=$(mktemp -d)
export TMP
trap "rm -rf $TMP" EXIT

cat > "$TMP/process.jq" << 'EOF'
# Convert parameter name "/terraform/widget/foo" to "foo"
def name_to_key: split("/")[-1];

# Cast SSM parameter strings to native JSON types
def present_value:
  if .Type == "StringList" then
    .Value | split(",")
  else
    .Value
  end;

# Convert each SSM parameter into a simple JSON hash/dict
def param_to_hash:
  {(.Name | name_to_key): (. | present_value)};

# Merge all parameter hashes
.Parameters | map(param_to_hash) | add
EOF

[ -n "${PROVISIONER_OUTPUT_PATH:-}" ] || PROVISIONER_OUTPUT_PATH="$TMP/output"

# Print and convert parameters from /terraform/widget/foo = "bar" into one large
# JSON hash of parameters, e.g. {"foo": "bar"}
aws ssm get-parameters-by-path \
  --path "/terraform/${SERVICE_NAME}" \
  --recursive \
  | jq -f "$TMP/process.jq" \
  | tee "$PROVISIONER_OUTPUT_PATH"

The script relies on the awscli and jq utilities being installed on the delegate host; it uses them to retrieve the values from Parameter Store and reformat them into a simple JSON object for Harness to read back in. There’s more information on setting up shell script provisioners in the docs here: Shell Script Provisioner - Harness.io Docs
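
To make the shape of that output concrete, here is roughly what ends up in $PROVISIONER_OUTPUT_PATH for the example service. The keys come from the Terraform for_each map shown earlier (trimmed to a few here), and the values are placeholders:

{
  "ecs_cluster_arn": "arn:aws:ecs:eu-west-1:111111111111:cluster/example",
  "ecs_cluster_name": "example",
  "region": "eu-west-1",
  "security_group_id": "sg-0123456789abcdef0",
  "task_iam_role_name": "example-fargate-task-role",
  "vpc_id": "vpc-0123456789abcdef0"
}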

Our infrastructure definitions are much simpler now and can be shared between services, as the values are looked up dynamically.

Now, with the shell script provisioner added to the workflow, we can see the values being fetched by the script and returned to Harness for the setup of the service.

One last tip: as well as using the data for infrastructure definitions, you can use ${shellScriptProvisioner.your_variable} to pull other data from the provisioner output into your workflow steps.
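
As a rough sketch (assuming the task_iam_role_name parameter from the Terraform snippet above), a later shell script step in the workflow could reference it like this:

# Harness resolves the ${shellScriptProvisioner.*} expression before this
# step runs, so bash just sees a plain string here
echo "Deploying with task role: ${shellScriptProvisioner.task_iam_role_name}"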

Hope that this comes in useful for somebody!

Dominic

This is great! Thanks @domclealfa for the post!
