Add runner job image metrics

Gordon Bleux requested to merge UiP9AV6Y/gitlab-runner:runner-image-metrics into main

What does this MR do?

Add runner job image metrics

System operators who provide custom images for executors might want to know which images are used and how often, either because they want to estimate the impact a change to an image could have, or because they want to identify images they no longer need to support because no one uses them. The feature is disabled by default because the image value for a job can be arbitrary, which causes high cardinality in the resulting metrics. It only makes sense to enable the feature if a) the retention policy for the metric is short, so as not to trash the time series database, or b) the number of images is low (e.g. a custom executor with a fixed set of images) and/or controlled (e.g. using an allowlist for the images).
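To illustrate the cardinality concern, here is a minimal sketch of how an allowlist limits the set of label values a per-image counter can take. It is not the code in this MR: `ALLOWED_IMAGES`, `image_label`, `count_jobs` and the `"other"` fallback label are hypothetical names chosen for the example, and the glob matching uses Python's `fnmatch` rather than whatever the runner uses internally.

```python
from collections import Counter
from fnmatch import fnmatchcase

# Hypothetical allowlist, written in the same glob style as `allowed_images`.
ALLOWED_IMAGES = ["docker.io/library/ruby:*", "docker.io/library/python:*"]


def image_label(image, allowed=ALLOWED_IMAGES, fallback="other"):
    """Return the label value to record for a job image.

    Images that match no allowlist pattern collapse into a single
    fallback label, so arbitrary user input cannot create new series.
    """
    if any(fnmatchcase(image, pattern) for pattern in allowed):
        return image
    return fallback


def count_jobs(images):
    """Count jobs per label value, the way a per-image counter metric would."""
    return Counter(image_label(image) for image in images)
```

Without the allowlist check, every distinct image string a job supplies would become its own time series, which is the reason the feature is off by default.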

This change introduces a new runner config section so that the feature can be expanded in the future (either by introducing new metrics or by retrofitting the system onto existing metrics).

Why was this MR needed?

We provide a number of different runners with different executors. Some of those use custom images. We currently do not know which images are actively used and require maintenance, and which are obsolete. We would like to deprecate some of the images to focus on other tasks. We have workarounds for some executor types (e.g. the docker and kubernetes executors), but we would prefer to have a unified solution for all executors (especially our various custom executors).

What's the best way to test this MR?

listen_address = "127.0.0.1:9111"

[[runners]]
  name = "image_metrics"
  url = "https://gitlab.com"
  executor = "docker"

  [runners.docker]
    image = "docker.io/library/ruby:3.3"
    allowed_images = ["docker.io/library/ruby:*", "docker.io/library/python:*", "docker.io/library/php:*"]
    allowed_services = ["docker.io/library/postgres:*", "docker.io/library/redis:*", "docker.io/library/mysql:*"]

  [runners.metrics]
    job_image = true
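With the configuration above, the runner serves its Prometheus metrics on the `listen_address`. A rough way to check the result after running a few jobs is to fetch the `/metrics` endpoint and look for samples that mention an image. The sketch below assumes the new metric carries the word `image` somewhere in its name or labels; the actual metric name is not stated in this description, and `metric_lines` and `fetch_metrics` are hypothetical helpers.

```python
from urllib.request import urlopen


def metric_lines(exposition_text, needle):
    """Return sample lines (not `# HELP` / `# TYPE` comments) containing needle."""
    return [
        line
        for line in exposition_text.splitlines()
        if line and not line.startswith("#") and needle in line
    ]


def fetch_metrics(url="http://127.0.0.1:9111/metrics", timeout=5):
    """Download the Prometheus text exposition from the runner."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example (requires a runner started with the config above):
#     for line in metric_lines(fetch_metrics(), "image"):
#         print(line)
```

If the feature works as described, samples should only appear when `job_image = true` is set under `[runners.metrics]`.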

What are the relevant issue numbers?

n/a

Edited by 🤖 GitLab Bot 🤖
