Telemetry: Export process metrics for all GitLab components
We would like to get a better idea of which GitLab components self-managed customers run and how much memory each of them consumes. This requires process-level memory metrics, including:
Minimum:
- RSS (resident set size)
Ideally also:
- PSS (proportional set size)
- USS (unique set size)
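For reference, a minimal sketch of how these three values can be read for a single process on Linux, assuming a kernel that exposes `/proc/<pid>/smaps_rollup`. The function names are hypothetical, and USS is approximated here as `Private_Clean + Private_Dirty`, which ignores private huge pages:

```python
def parse_smaps_rollup(text):
    """Return RSS, PSS and USS in bytes from the text of /proc/<pid>/smaps_rollup."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        # Data lines look like "Rss:   1234 kB"; the address-range header line does not.
        if len(parts) == 3 and parts[0].endswith(":") and parts[2] == "kB":
            fields[parts[0][:-1]] = int(parts[1]) * 1024
    return {
        "rss": fields.get("Rss", 0),
        "pss": fields.get("Pss", 0),
        # Assumption: USS is approximated as the sum of the private pages.
        "uss": fields.get("Private_Clean", 0) + fields.get("Private_Dirty", 0),
    }


def read_process_memory(pid):
    """Read the memory figures for a live process (Linux only, needs read access)."""
    with open(f"/proc/{pid}/smaps_rollup") as f:
        return parse_smaps_rollup(f.read())
```

Reading `smaps_rollup` for another user's process typically requires elevated privileges, which matters if a single exporter is meant to cover every component.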
We decided that Prometheus is the best place to collect this data: we can query it by component and submit the results back as part of the usage ping.
We recently started tracking all of these for the Ruby components (web, sidekiq), but not yet for non-Ruby components (gitaly, workhorse, etc.).
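As an illustration of querying Prometheus by component, here is a rough sketch against the Prometheus HTTP API instant-query endpoint (`/api/v1/query`). It assumes each component is scraped under its own `job` label and exposes the standard `process_resident_memory_bytes` metric. Both are assumptions about the deployment, and the function names are hypothetical:

```python
import json
import urllib.parse
import urllib.request

# Assumption: the `job` label identifies the component (gitaly, workhorse, ...).
QUERY = "avg by (job) (process_resident_memory_bytes)"


def build_query_url(base_url, query=QUERY):
    """Build the instant-query URL for a Prometheus server."""
    params = urllib.parse.urlencode({"query": query})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def parse_instant_vector(payload):
    """Map each `job` label to its sample value from an instant-query response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    result = {}
    for sample in payload["data"]["result"]:
        component = sample["metric"].get("job", "unknown")
        _timestamp, value = sample["value"]  # value arrives as a string
        result[component] = float(value)
    return result


def memory_by_component(base_url):
    """Fetch the average RSS per component from a running Prometheus server."""
    with urllib.request.urlopen(build_query_url(base_url), timeout=10) as resp:
        return parse_instant_vector(json.load(resp))
```

The resulting mapping of component name to average RSS is the kind of payload that could be attached to a usage ping. The actual integration point is not shown here.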
For some select components, we already run https://github.com/ncabatoff/process-exporter. We should look into providing these metrics for all remaining component processes as well or, if that is too expensive, at least in aggregate, so that we know the average memory use of e.g. gitaly (node memory alone is too coarse).
It was also pointed out that process-exporter does not appear to export USS, only RSS and PSS. We may want to look into adding that functionality or see if it is derivable from the other metrics it exports.
We cannot readily use process-exporter because it only runs for gitlab.com, but not self-managed. If we want to track processes of services that we do not own, we could look into running it as part of every Omnibus installation.
Update July 17 2020
We now track memory and other metrics for the following services:
- web
- sidekiq
- workhorse
- gitaly
- postgres
- redis
- prometheus
- node-exporter
- registry
However, there are many more services that may run as part of larger GitLab deployments (see comments for details) that we do not yet track.
If we need to track ALL GitLab services, the best options we can see are:
- process_exporter: not in Omnibus yet, but already running on GitLab.com. It was previously considered too heavy, but that is not necessarily the case. We can re-evaluate it if we need to track all GitLab services. #218546 (comment 378029612)
- gitlab_exporter: many people disagree with this option, and it has been said that we want to deprecate gitlab_exporter.
We need a decision from Product on which services not yet on this list, if any, we would like to track in addition. Otherwise, we can call this issue done.