Diagnostic reports: compress data
What does this MR do and why?
Refs #370077 (closed)
This adds a gzip compression step when streaming diagnostic reports to disk, which will dramatically decrease report size both on disk and on the wire (we upload these to a GCS bucket every half hour or so).
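To illustrate the idea of compressing while streaming, here is a minimal Python sketch. It is not the code in this MR (which targets the Ruby application and may well pipe through the `gzip` binary instead); the function names and the fake report generator are hypothetical and exist only for illustration.

```python
import gzip
import json


def stream_report_gzipped(chunks, out_path, level=1):
    """Write an iterable of text chunks to out_path as a gzip stream.

    Chunks are compressed as they arrive, so the uncompressed report never
    has to be held in memory or written to disk first.
    """
    with gzip.open(out_path, "wt", encoding="utf-8", compresslevel=level) as out:
        for chunk in chunks:
            out.write(chunk)


def fake_heap_dump_lines(count):
    # Hypothetical stand-in for a report: one JSON object per line.
    for i in range(count):
        record = {"address": hex(0x7F0000000000 + i * 40), "type": "STRING", "memsize": 40}
        yield json.dumps(record) + "\n"
```

Calling `stream_report_gzipped(fake_heap_dump_lines(1000), "report.json.gz")` writes a file that `gzip -d` or `gzip.open(..., "rt")` can read back. Level 1 mirrors the `gzip -1` setting measured below.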
Performance and efficacy
Our report files are currently all JSON text files, i.e. highly compressible. For Object Space ("heap") dumps this is especially effective, since a typical heap dump for a hot worker process is around 1GB of uncompressed data.
I looked at 3 different compression tools: gzip, bzip2, and zstd. I summarized the results below. Data was collected via /usr/bin/time -v for a 1GB puma heap dump on an 8 core i9@3.8GHz (Carbon X1).
tool | user_s | system_s | peak RSS | size |
---|---|---|---|---|
zstd -1 | 1.07 | 0.20 | 12M | 77M |
gzip -1 | 4.02 | 0.13 | 2M | 103M |
bzip2 -1 | 66.14 | 0.47 | 2.3M | 89M |
zstd can leverage multiple CPU cores to speed up processing, but this can result in higher memory use, especially at higher compression levels. With the default level of -3, it used roughly 3 times as much memory (39MB). It also supports setting a memory cap, but this did not appear to have any effect during testing. I think the 12MB used at level -1 is definitely acceptable here.
bzip2 has unacceptable performance characteristics: it used roughly 16 times the CPU time of gzip for compression that is only marginally better (89M vs. 103M).
I chose gzip by default because it seems to offer a decent trade-off between CPU and memory use, and it is already installed in our production CNG images. I will look into swapping this out with zstd in a follow-up issue.
gzip package
I verified that gzip is installed in the gitlab-rails image already, but nonetheless I am making this explicit here: gitlab-org/build/CNG!1218 (merged)
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to #370077 (closed)