Upload diagnostic reports to GCS: interface (!97155) · Merge requests · GitLab.org / GitLab

Aleksei Lipniagov requested to merge 372242-upload-diagnostic-reports-to-gcs into master Sep 06, 2022

What does this MR do and why?

This was originally an all-in-one uploader implementation, but we decided to split it.
This MR establishes the interface and all the logic needed to kick off the uploader, but does not include the uploader itself.
Instead, it only logs the reports it would upload, operating in a "read-only" fashion.
The actual uploader part will be done in !97923 (closed).

Context

We need more data, especially from production instances, to analyze performance and memory problems in GitLab. This includes collecting jemalloc statistics or obtaining Ruby heap dumps from web server workers.

We needed to rely on SRE support to trigger and obtain these reports, which was neither efficient nor convenient.
We decided to build the ability to trigger these reports in the application. This is implemented in !91283 (merged). We started with jemalloc reports. Currently, this is the only report we run, but we plan to add Ruby heap dump reports (issue) soon.

The next step is to make these reports available to engineers.
We decided to build an automatic uploader that uploads them to a dedicated GCS bucket.
This bucket will be accessible to the GitLab team.

Implementation approach

There was a long discussion regarding how this should be implemented. We evaluated multiple approaches.
We came up with multiple PoCs: uploading via fog and uploading via curl. We also evaluated a dedicated process but decided not to go with it just yet (more in the discussion).

Our primary requirements were listed in: !96045 (closed)

We decided to go with:

The Puma worker will spawn a thread with BackgroundTask.
This task will check the target directory for any reports to upload.
The uploading will be done either by 1) shelling out to curl via popen to request GCS, or 2) shelling out to a Ruby script that uses net/http to reach GCS.
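
The flow above, limited to the "read-only" behavior this MR ships, could be sketched roughly as follows. This is a minimal illustration and not the code in this MR: the class names LoggingReportsUploader and ReportsScanner, their methods, and the log message are hypothetical, and a plain Thread stands in for BackgroundTask.

```ruby
require 'json'
require 'logger'

# Hypothetical stand-in for the uploader interface: it only logs the
# reports it is given ("read-only") and never contacts GCS.
class LoggingReportsUploader
  def initialize(logger)
    @logger = logger
  end

  def upload(path)
    @logger.info({ message: 'Diagnostic report found', path: path }.to_json)
  end
end

# Hypothetical periodic task: scans the reports directory and hands every
# regular file to the uploader, then sleeps until the next pass.
class ReportsScanner
  def initialize(reports_path:, uploader:, sleep_s:)
    @reports_path = reports_path
    @uploader = uploader
    @sleep_s = sleep_s
    @running = true
  end

  # Single pass over the directory; returns the paths handed to the uploader.
  def scan_once
    return [] unless Dir.exist?(@reports_path)

    Dir.children(@reports_path)
       .map { |name| File.join(@reports_path, name) }
       .select { |path| File.file?(path) }
       .sort
       .each { |path| @uploader.upload(path) }
  end

  # Runs the scan loop in a background thread (assumption: a plain Thread
  # is used here in place of BackgroundTask).
  def start
    Thread.new do
      while @running
        scan_once
        sleep(@sleep_s)
      end
    end
  end

  def stop
    @running = false
  end
end
```

Keeping the uploader behind a single upload(path) method means a later implementation (curl or net/http based) could replace the logging one without touching the scanning loop.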

Notes

Self-managed instances are not affected because the feature only runs when GITLAB_DIAGNOSTIC_REPORTS_ENABLED is set. Currently, we are not working on enabling this for self-managed.

Local development environments are not affected unless you set GITLAB_DIAGNOSTIC_REPORTS_ENABLED. The feature flag must also be enabled.

⛳ FF rollout issue: #372771 (closed)
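
The two conditions from the notes above (ENV variable and feature flag) can be read as a single guard. The sketch below is only an illustration: the method name is hypothetical, the comparison against the string 'true' is an assumption about how the variable is parsed, and the feature flag check is passed in as a lambda so the snippet runs outside Rails.

```ruby
# Hypothetical guard: the uploader runs only when the ENV variable is set
# AND the feature flag is enabled. Comparing against 'true' is an assumption.
def diagnostic_reports_uploader_enabled?(env: ENV, feature_enabled: -> { false })
  env['GITLAB_DIAGNOSTIC_REPORTS_ENABLED'].to_s == 'true' && feature_enabled.call
end
```

Inside the application, the lambda would presumably wrap Feature.enabled?(:gitlab_diagnostic_reports_uploader).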

How to set up and validate locally

Set necessary ENV vars:

export GITLAB_DIAGNOSTIC_REPORTS_UPLOADER_SLEEP_S=3
export GITLAB_DIAGNOSTIC_REPORTS_ENABLED='true'
export GITLAB_DIAGNOSTIC_REPORTS_PATH='/Users/al/dev/tmp-diag-reports-uploads'

Replace the listed dir with a dir of your choice.

Run gdk restart to pick up the ENV vars.
Enable the feature flag in rails c: Feature.enable(:gitlab_diagnostic_reports_uploader)
tail -f log/application_json.log should print the files we expect to upload once the uploader is implemented.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

I have evaluated the MR acceptance checklist for this MR.

Related to #372242 (closed)

Edited Sep 15, 2022 by Aleksei Lipniagov
