Canary Workflow Recommendations for Terraform, AMI & Kubernetes

Introduction

Canary Deployment is Harness’ key deployment methodology. A Canary Deployment reduces the scope, impact, and risk of deploying new software artifacts to production. Instead of deploying a new artifact across all production nodes at the same time, you deploy to a subset of those nodes in phases, with each phase verified to ensure that the artifact and your application are behaving appropriately. The term “Canary” comes from the old coal-mining practice of carrying a canary underground: the bird reacted to toxic gas before the miners did, giving them an early warning.

An Example of a 3-Phase Canary Deployment

Kubernetes

In this methodology, Phase 1 allocates 10% of user traffic to the V2 service. This gives developers time to observe how their new feature behaves under real load. It also allows engineering teams to target subsets of users that meet certain criteria or demographics for a specific feature. As the development team gains confidence, we progress to Phase 2, scale the V2 service to 50% of the instances, and route 50% of the traffic to it. As a result, more users exercise the functionality, and the dev team can review and verify performance through APM tools such as AppDynamics, Splunk, and Datadog. In Phase 3, we are fully confident in the V2 service, so we roll it out to 100% and let all our customers use it.
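The three phases above can be expressed as a small calculation. The following is a minimal sketch, not Harness' implementation: the function name `canary_plan`, the default phase percentages, and the choice to round up so that Phase 1 always gets at least one V2 instance are all assumptions made for illustration.

```python
from typing import List, Tuple


def canary_plan(total_replicas: int,
                phase_percents: Tuple[int, ...] = (10, 50, 100)) -> List[Tuple[int, int, int]]:
    """Return (percent, v2_replicas, v1_replicas) for each canary phase.

    Hypothetical helper: the percentages mirror the 10% / 50% / 100%
    phases described in the text.
    """
    plan = []
    for pct in phase_percents:
        # Integer ceiling division, so a small rollout still gets one V2 instance.
        v2 = -(-total_replicas * pct // 100)
        v2 = min(total_replicas, max(1, v2))
        plan.append((pct, v2, total_replicas - v2))
    return plan


# Example: a service that normally runs 10 instances.
for pct, v2, v1 in canary_plan(10):
    print(f"Phase at {pct}%: {v2} x V2, {v1} x V1")
```

With 10 instances this yields 1, 5, and 10 V2 instances across the three phases. Note that instance count and traffic share only line up this neatly when traffic is spread evenly across instances; a traffic-splitting layer would let the two be set independently.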

The beauty of Canary is that, unlike Blue/Green, we don’t need a separate environment or a second set of infrastructure to maintain. We can deploy in our existing infrastructure and route a percentage of traffic to the newly deployed version.

Canary & Harness

Canary deployments can be used to deploy various services and infrastructure. They allow users to test a service or a piece of infrastructure for a brief period of time before tearing it down or promoting it to the next stage.

Kubernetes

Setup

  • Kubernetes Service
  • Canary Workflow
    • 1 Canary Phase (minimum)
    • 1 Primary Phase (minimum)

Best Practice

  • Deploy Canary
    • Run verification on the single canary instance
    • Send synthetic traffic to it
  • Scale Canary to a desired number
    • Run verification
    • Send more synthetic traffic to test load at a more realistic scale
    • Tear down Canary Deployment
  • Roll out New Version of Artifact
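The ordering of these steps can be sketched in code. This is an illustrative model only, not a Harness API: `run_canary_workflow`, the `verify` callback, and the default `canary_target` are hypothetical names, and the behavior on a failed verification (tear down the canary and skip the rollout) is an assumption rather than something the steps above spell out.

```python
from typing import Callable, List


def run_canary_workflow(verify: Callable[[int], bool],
                        canary_target: int = 5) -> List[str]:
    """Walk through the canary steps and return the steps that ran.

    `verify(instance_count)` is a hypothetical stand-in for running
    verification plus synthetic traffic against the canary.
    """
    steps = ["deploy canary (1 instance)"]

    # Canary phase, part 1: verify the single canary instance.
    if not verify(1):
        steps += ["tear down canary", "abort: rollout skipped"]
        return steps

    # Canary phase, part 2: scale up and verify at a more realistic load.
    steps.append(f"scale canary to {canary_target}")
    passed = verify(canary_target)

    # The canary is always torn down before the primary phase.
    steps.append("tear down canary")
    if not passed:
        steps.append("abort: rollout skipped")
        return steps

    # Primary phase: roll out the new artifact version.
    steps.append("roll out new version")
    return steps


for step in run_canary_workflow(lambda instance_count: True):
    print(step)
```

The point of the sketch is the gating: the rollout step is only reached when both verification runs pass, and the canary is removed in every path.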

Example Workflow

Example Deployment

10% Deployment of the Canary Version

Scale up the Deployment to 50%

Full Rollout of new Version

Infrastructure Provisioning

Infrastructure Provisioning fits well in a Canary Workflow because of the concept of Ephemeral Environments. You want to quickly spin up infrastructure, deploy an application onto it, run tests and verification, and then tear down the ephemeral infrastructure. That entire process mirrors a Canary Deployment.

Best Practices

  • Provision Terraform Infrastructure
  • Map Outputs to the Environment fields in Harness
  • Deploy Application or service onto Infra
  • Run Verification on the service
  • Destroy the environment with Terraform Destroy
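Outside of Harness, the same provision, deploy, verify, destroy sequence can be sketched around the Terraform CLI. This is a minimal sketch under stated assumptions: `ephemeral_environment`, `shell_runner`, and the `deploy_and_verify` callback are hypothetical names, and mapping outputs into Harness Environment fields is represented only by the plain dictionary passed to the callback. The Terraform commands used (`init`, `apply -auto-approve`, `output -json`, `destroy -auto-approve`) are standard CLI commands.

```python
import json
import subprocess
from typing import Callable, Dict, List

Runner = Callable[[List[str]], str]


def shell_runner(workdir: str) -> Runner:
    """Build a runner that executes commands in the Terraform directory."""
    def run(cmd: List[str]) -> str:
        result = subprocess.run(cmd, cwd=workdir, check=True,
                                capture_output=True, text=True)
        return result.stdout
    return run


def ephemeral_environment(run: Runner,
                          deploy_and_verify: Callable[[Dict[str, object]], bool]) -> bool:
    """Provision, hand outputs to the deploy/verify step, then always destroy."""
    run(["terraform", "init", "-input=false"])
    try:
        run(["terraform", "apply", "-auto-approve", "-input=false"])
        # `terraform output -json` returns {"name": {"value": ..., ...}, ...}.
        raw = json.loads(run(["terraform", "output", "-json"]))
        outputs = {name: entry["value"] for name, entry in raw.items()}
        # Hypothetical callback: deploy the service and run verification.
        return deploy_and_verify(outputs)
    finally:
        # Tear down the ephemeral infrastructure even if a step above failed.
        run(["terraform", "destroy", "-auto-approve", "-input=false"])
```

A call might look like `ephemeral_environment(shell_runner("./infra"), my_deploy_fn)`, where `./infra` and `my_deploy_fn` are placeholders. Putting the destroy step in a `finally` block is a design choice that keeps failed runs from leaving infrastructure behind.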

Sample Canary Terraform Provisioning Deployment

AMI

When we do Canary Deployments with AMIs, we first deploy one instance of the AMI and run verification tests on it to make sure it’s healthy. After the tests pass, we scale the AMI instance count up to the desired value.

Best Practice

  • Deploy 1 Instance of AMI for Canary
  • Run Verification on Canary
  • Roll out the desired number of instances of the new AMI version
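The instance counts across these steps can be modeled as a short timeline. This is an illustrative sketch only: `ami_canary_rollout` and the `is_healthy` callback are hypothetical names, no cloud API is called, and removing the canary when it fails verification is an assumption.

```python
from typing import Callable, List, Tuple


def ami_canary_rollout(desired_count: int,
                       is_healthy: Callable[[], bool]) -> List[Tuple[str, int]]:
    """Return (step, instance_count) pairs for an AMI canary rollout."""
    # Step 1: a single instance of the new AMI acts as the canary.
    timeline = [("deploy canary", 1)]

    # Step 2: verification gates the scale-out.
    if not is_healthy():
        timeline.append(("canary unhealthy, remove it", 0))
        return timeline

    # Step 3: scale the new AMI out to the desired instance count.
    timeline.append(("scale out new AMI", desired_count))
    return timeline


for step, count in ami_canary_rollout(4, lambda: True):
    print(f"{step}: {count} instance(s)")
```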

Under the hood

Deploy the Canary and run load tests or health checks on it.

Once the user is comfortable with the canary, we scale out the AMIs.