Report Ruby process USS+PSS into Prometheus
What does this MR do?
Refs #215526
Closes #215864 (closed)
We've been looking more at preferring USS and PSS (unique and proportional set size) to gauge memory use on Linux, since they account for shared memory. This is important for pre-fork servers, where a large chunk of memory remains static and can be shared between workers.
This MR reports these metrics alongside RSS in our Ruby sampler. The change is behind a feature flag:
- feature flag: collect_memory_uss_pss
Implementation
In the past we sampled this data from /proc/self/smaps, which can be very slow: the kernel reports data for each virtual memory area (VMA) mapped by the process, and those figures were then summed up in Ruby space.
Here we instead rely on a relatively new kernel feature that came out of Android, which sums up all relevant VMAs into a new file, /proc/self/smaps_rollup. It has a single PSS figure, as well as private memory page figures that can be further rolled up into USS, for the entire process:
> By using smaps_rollup instead of smaps, a caller can avoid the significant overhead of formatting, reading, and parsing each of a large process's potentially very numerous memory mappings. For sampling system_server's PSS in Android, we measured a 12x speedup, representing a savings of several hundred milliseconds.
https://patchwork.kernel.org/patch/9896795/
i.e.
pss = Pss entry
uss = sum of Private_ page entries
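The mapping above can be sketched in Ruby roughly as follows. This is a minimal illustration, not the code from this MR: the `SmapsRollup` module and its method names are hypothetical, and it assumes every relevant line has the form `Name: <value> kB`, so values are multiplied by 1024 to get bytes.

```ruby
# Hypothetical helper for illustration; names are not taken from this MR.
module SmapsRollup
  BYTES_PER_KB = 1024

  # Parses smaps_rollup-style text and returns { pss: bytes, uss: bytes }.
  def self.parse(text)
    pss_kb = 0
    uss_kb = 0

    text.each_line do |line|
      # Matches lines such as "Pss:              567474 kB".
      # The leading "[rollup]" address-range line does not match and is skipped.
      match = line.match(/\A(\w+):\s+(\d+) kB/)
      next unless match

      name = match[1]
      value = match[2].to_i

      pss_kb = value if name == 'Pss'                  # exact match, ignores Pss_Anon etc.
      uss_kb += value if name.start_with?('Private_')  # Private_Clean, Private_Dirty, ...
    end

    { pss: pss_kb * BYTES_PER_KB, uss: uss_kb * BYTES_PER_KB }
  end

  # Returns nil where the file is unavailable (e.g. macOS, which has no /proc).
  def self.sample(path = '/proc/self/smaps_rollup')
    return nil unless File.readable?(path)

    parse(File.read(path))
  end
end
```

Keeping `parse` separate from the file read makes the summing logic testable on platforms without /proc.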
Does this MR meet the acceptance criteria?
Conformity
- [-] Changelog entry
- [-] Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- [-] Database guides
- [-] Separation of EE specific content
Availability and Testing
- Test on review-app
- Test on macOS, which does not have /proc
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Additionally, I verified that sampling works locally as follows:
USS:
- curl -v localhost:3000/-/metrics | egrep '^ruby_process_unique' | grep puma_1
  - => ruby_process_unique_memory_bytes{pid="puma_1"} 166150144
- cat /proc/530/smaps_rollup | grep Private_ | awk '{print $2}' | paste -sd+ | read sum; bc <<< "($sum) * 1024"
  - => 166150144
- => 166150144 == 166150144
PSS:
- curl -v localhost:3000/-/metrics | egrep '^ruby_process_proportional' | grep puma_1
  - => ruby_process_proportional_memory_bytes{pid="puma_1"} 581093376
- cat /proc/530/smaps_rollup | egrep '^Pss:' | awk '{print $2}' | read pss; bc <<< "($pss) * 1024"
  - => 581966848
- => 581093376 ~= 581966848
Note that the two readings are not expected to match exactly at all times, because Prometheus only samples periodically and the process's memory can change between the sample and the manual read.
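To put the PSS gap in perspective, the arithmetic can be worked through with the two readings above. This is only an illustrative sketch; `relative_difference` is a hypothetical helper and not part of this MR.

```ruby
# Hypothetical helper: relative difference between two byte counts.
def relative_difference(a, b)
  (a - b).abs.to_f / [a, b].max
end

sampled = 581_093_376 # value reported via /-/metrics
direct  = 581_966_848 # value read from smaps_rollup moments later

diff_kb = (direct - sampled) / 1024
percent = (relative_difference(sampled, direct) * 100).round(2)

puts "difference: #{diff_kb} kB (#{percent}%)"
# prints "difference: 853 kB (0.15%)"
```

A gap of this size is consistent with ordinary memory churn between two reads taken at slightly different times.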