This article explains how Continuous Verification (CV) works compared with how Service Health works for Prometheus.
Problem Statement:
We added a Prometheus Verify step, and the query being used is the following:
`sum(rate(container_cpu_usage_seconds_total{container="app-demo", namespace="Harnesscanary-dev"}[5m]))* 100`
* This query successfully returns a value in Prometheus and on the Harness Service Health page.
* However, when the Verify step runs in our Pipeline, it returns no data according to the "External API Calls" logs in Harness, and as a result, the step doesn't create a chart/graph.
* If you take a look at the request being sent from the Verify step, the URL does not appear to include the full query:
`https://prometheus-Harnesscanary.d4vhh.dev.cus.kpsazc.dgtl.harness.com/api/v1/label/Node/values?start=1668115860&end=1668116160&match[]={container=%22app-demo%22,%20namespace=%22kpscicdcanary-dev%22}`
This is working as expected. The difference comes from how CV and Service Health each query Prometheus, as explained below.
For Service Health (and SLIs), we use the exact query provided by the user to fetch the metric data and calculate the health score.
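As a rough illustration of the Service Health case, the sketch below builds a Prometheus range-query URL that carries the user's query unchanged. The host `https://prometheus.example.com`, the `step` value, and the helper name `build_range_query_url` are assumptions for illustration; the exact request Harness sends may differ.

```python
from urllib.parse import urlencode


def build_range_query_url(base_url, query, start, end, step="60s"):
    """Build a /api/v1/query_range URL that sends the query exactly as written."""
    params = {"query": query, "start": start, "end": end, "step": step}
    return f"{base_url}/api/v1/query_range?{urlencode(params)}"


# The query from the problem statement, passed through untouched.
user_query = (
    'sum(rate(container_cpu_usage_seconds_total'
    '{container="app-demo", namespace="Harnesscanary-dev"}[5m]))* 100'
)

# Placeholder host and an assumed 60s step; the timestamps are the ones in the logged request.
range_url = build_range_query_url(
    "https://prometheus.example.com", user_query, 1668115860, 1668116160
)
print(range_url)
```

The whole expression, including the `sum(rate(...))` wrapper, ends up in the `query` parameter, which is why the same query returns data on the Service Health page.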
For CV, we need to analyze the metrics per service instance so that we can compare behavior before and after the deployment. For that, we use the service instance identifier.
Using that identifier, we first find all of its label values and then fetch metrics for each returned value.
What you are seeing is that first API call, which fetches the label values. Because it returns no values, we do not make any metric query calls.
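The two-step CV flow described above can be sketched roughly as follows. This is a minimal illustration, not Harness's actual implementation: the host, the helper names, the `step` value, the sample values `node-a` and `node-b`, and the way the instance label is added to the query (`add_label_matcher`) are all assumptions. Only the `/api/v1/label/<label>/values` and `/api/v1/query_range` endpoints and the `Node` label come from Prometheus and the request shown earlier.

```python
from urllib.parse import urlencode


def label_values_url(base_url, label, selector, start, end):
    """Step 1: URL that lists the values of `label` for series matching `selector`."""
    # urlencode percent-encodes "match[]" as "match%5B%5D"; Prometheus decodes it normally.
    params = {"start": start, "end": end, "match[]": selector}
    return f"{base_url}/api/v1/label/{label}/values?{urlencode(params)}"


def add_label_matcher(query, label, value):
    """Hypothetical helper: append label="value" to the first selector in the query."""
    return query.replace("}", f', {label}="{value}"}}', 1)


def metric_query_urls(base_url, query, label, values, start, end, step="60s"):
    """Step 2: one query_range URL per label value; no values means no calls."""
    urls = []
    for value in values:
        params = {
            "query": add_label_matcher(query, label, value),
            "start": start,
            "end": end,
            "step": step,
        }
        urls.append(f"{base_url}/api/v1/query_range?{urlencode(params)}")
    return urls


BASE = "https://prometheus.example.com"  # placeholder host
QUERY = (
    'sum(rate(container_cpu_usage_seconds_total'
    '{container="app-demo", namespace="Harnesscanary-dev"}[5m]))* 100'
)
SELECTOR = '{container="app-demo", namespace="Harnesscanary-dev"}'

# First call: fetch the values of the service instance identifier ("Node" here).
first_call = label_values_url(BASE, "Node", SELECTOR, 1668115860, 1668116160)
print(first_call)

# If step 1 returns nothing, there is nothing to iterate over and no metric query is sent.
no_instances = metric_query_urls(BASE, QUERY, "Node", [], 1668115860, 1668116160)
print(len(no_instances))

# With two (made-up) instances, one metric query is planned per instance.
two_instances = metric_query_urls(
    BASE, QUERY, "Node", ["node-a", "node-b"], 1668115860, 1668116160
)
print(len(two_instances))
```

In this sketch an empty label-values response leads directly to zero metric queries, which matches the behavior seen in the "External API Calls" logs: only the label-values request appears, and no chart is produced.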
For more information on verifying a deployment with Prometheus in a Harness pipeline, see: Verify Deployment with Prometheus - Harness.io Docs.