Clear SafeRequestStore often when exporting
What does this MR do and why?
This MR attempts to resolve memory consumption issues that have been reported in #420623 (closed)
It:
- Updates
StreamingSerializer
to have exported records to be appended to file on disk in batches, instead of having them all exported into 1 enumerator and then written - Because we now write to file in batches, we can clear
SafeRequestStore
often, which lowers the overall memory footprint during export of big projects/groups
This is most evident when exporting projects with a lot of issues/mrs that have a lot of system notes in them. Because we recently added note export authorization checks (only export system notes that the exporting user has access to) it spiked up memory consumption of the export, resulting in sidekiq worker & node in running out of RAM when performing export. This is due to expensive authorization checks that each system note has to go through & us caching a lot of data in memory in SafeRequestStore
, resulting in memory bloat. Clearing the store after each batch of 100 records keeps memory consumption to what it was before, as seen below:
Exporting a project with 5000 issues, each issue having 10 regular and 2 system notes:
> /usr/bin/time -l rails runner script.rb
# script.rb
::Gitlab::SafeRequestStore.ensure_request_store do
ProjectExportWorker.new.perform(1, 20)
end
Clearing request store, 900Mb MAX RSS (peak memory usage of the process):
87.72 real 72.30 user 7.29 sys
939163648 maximum resident set size
Not-clearing request store, 1.8Gb MAX RSS:
93.32 real 70.62 user 14.79 sys
1894924288 maximum resident set size
This change brings down MAX RSS to the figures when we didn't peform auth checks:
31.20 real 23.79 user 6.18 sys
902938624 maximum resident set size
Clear SafeRequestStore often when exporting
- Add SafeRequestStore.clear! when exporting projects/groups in order to keep memory usage at bay
- Update StreamingSerializer to append records to file in batches instead of having all records in a single enumerator
Changelog: fixed
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
Before | After |
---|---|
How to set up and validate locally
- Create a project with a lot of issues (e.g. 5k like I did in the example above)
- Create a system note in each of the issues (e.g. each issue mentioned on a commit)
- Perform export of the project and measure peak memory (e.g. like I did in the example above)
# script.rb
::Gitlab::SafeRequestStore.ensure_request_store do # we need to wrap in this block because that's how sidekiq jobs are executed (see lib/gitlab/sidekiq_middleware/request_store_middleware.rb:7)
ProjectExportWorker.new.perform(1, 20)
end
/usr/bin/time -l rails runner script.rb
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.