Deactivate prune webhooklogs worker
What does this MR do?
It deactivates the PruneWebHookLogsWorker recurring Sidekiq job. The job runs once per hour and removes old web_hook_logs entries. It was originally introduced in 75316348. @iroussos and the Database group are working on partitioning the web_hook_logs table to make it easier to drop old records. However, PruneWebHookLogsWorker is currently blocking the migration, as described in &5558 (comment 542199537).
Our proposal is to disable PruneWebHookLogsWorker for both GitLab.com and self-hosted instances in %13.11:
Why disable it on GitLab.com?
The PruneWebHookLogsWorker cron job cannot keep up with the rate at which new records are added: it removes ~2.2M (= 50000 * (168 - 125)) records per week, while well over 3M new records are created per day. Even if we were to address all issues, we would clean up 1.2M records per day, which is only about 35-40% of the rate at which new records are added.
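The figures above can be checked with a quick calculation. The sketch below is only illustrative: the constant names are made up, and it assumes that 50000 is the number of records a single run removes and that 125 is the number of hourly runs per week that remove nothing (for example because they time out). The description gives the numbers but does not spell out that reading.

```ruby
# Back-of-the-envelope check of the pruning throughput quoted in this description.
# Constant names are hypothetical; only the numbers come from the text.

BATCH_PER_RUN     = 50_000     # assumption: records removed by one run
RUNS_PER_WEEK     = 7 * 24     # hourly schedule, so 168 runs per week
UNPRODUCTIVE_RUNS = 125        # assumption: weekly runs that remove nothing
NEW_PER_DAY       = 3_000_000  # lower bound on new records per day

# What the worker removes today, per week.
CURRENT_WEEKLY = BATCH_PER_RUN * (RUNS_PER_WEEK - UNPRODUCTIVE_RUNS)

# What it could remove per day if every hourly run succeeded.
BEST_CASE_DAILY = BATCH_PER_RUN * 24

# Shares of the incoming volume that get pruned in each scenario.
CURRENT_SHARE   = CURRENT_WEEKLY.to_f / (NEW_PER_DAY * 7)
BEST_CASE_SHARE = BEST_CASE_DAILY.to_f / NEW_PER_DAY

puts "removed per week today:    #{CURRENT_WEEKLY}"
puts "removed per day, at best:  #{BEST_CASE_DAILY}"
puts format("share pruned today:        %.1f%%", CURRENT_SHARE * 100)
puts format("share pruned, best case:   %.1f%%", BEST_CASE_SHARE * 100)
```

With exactly 3M new records per day the best case is 40% of the incoming volume. Since the real rate is above 3M, the share drops toward the lower end of the 35-40% range, and the current share is only around a tenth.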
We think that it is better to stop cleaning the old records and prune the old partitions once and for all in %14.0, once we are done with the partitioning migration, than to keep the worker around while it is sending queries that time out.
That would mean that ~50GB of additional records per month will go uncleaned until June, but that is a small fraction of the current size of web_hook_logs and in total less than a month's worth of data (in March we went up to 170GB, as you can see in my comment above).
Why disable it for self-hosted instances as well?
We worry that similar locking issues may occur while (large) self-hosted instances run post-deployment migrations in a no-downtime way.
As they are not at the scale of GitLab.com, there is the additional possibility that PruneWebHookLogsWorker has not fallen behind and that it will directly compete with the backfilling migration for the same sets of records, causing even more lock conflicts.
Does this MR meet the acceptance criteria?
Conformity
- 📋 Does this MR need a changelog?
  - I have included a changelog entry.
  - I have not included a changelog entry because _____.
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done