feat: Add job name to kubernetes pod labels

  • Please check this box if this contribution uses AI-generated content (including content generated by GitLab Duo features) as outlined in the GitLab DCO & CLA

What does this MR do?

This adds the job name to pod labels.

Why was this MR needed?

The job name already exists as a pod annotation, but annotations cannot be used to filter metrics in GCP. Filtering by job name is important for rightsizing job resources.

Pod labels can be used as a filter in any GCP k8s_container metric (CPU/memory request and limit utilization) via "user metadata labels".

I suspect that this label could also help with other observability platforms, though I haven't researched them.

Some common off-the-shelf tools do not help in this scenario:

Workaround 1 (toil):

  1. Set pod_labels_overwrite_allowed in the runner configuration
  2. Manually add a job name label to every job

Workaround 2 (incomplete):

  1. GKE audit logs can be used to create a log-based metric that has both the job name (from the annotation) and the pod name as labels
  2. This metric can be joined to the k8s_container metrics in PromQL and filtered by job name

This workaround is incomplete because GitLab CI often creates pod objects so large that they are not fully included in the GKE audit logs and are instead marked audit.k8s.io/truncated. As a result, job metrics are missing for the most important (large) jobs. I believe these pod objects are so large because GitLab CI injects CI/CD variables directly into the pod spec instead of referencing them from a Kubernetes Secret.

What's the best way to test this MR?

  • Automated tests
  • Creating a job with the Kubernetes executor and verifying that the label exists on the resulting pod

What are the relevant issue numbers?

https://support.gitlab.com/hc/en-us/requests/527861

Edited by Charlie Getzen
