Take lease in UpdateProjectStatisticsWorker

Will Chandler (ex-GitLab) requested to merge wc-update-stats-lease into master

What does this MR do and why?

Describe in detail what your merge request does and why.

Recent changes to how we calculate repository size have made this a much more expensive operation. Previously we used a naive du -sk call, which can give inconsistent results depending on how well-packed the repository is. Swapping to git rev-list --all --objects --disk-usage gives us much more accurate results, but is orders of magnitude more expensive to run. See confidential issue #351415 for the conversation around this change.

To help mitigate the increased load generated by this change, we want to ensure that project statistics are refreshed at most once every 15 minutes. Currently we take an exclusive lease in ProjectCacheWorker to limit concurrency, but after 15 minutes this triggers an UpdateProjectStatisticsWorker call that does not take a lease. As a result, a subsequent push 15 minutes later will regenerate statistics at the same time that the delayed job scheduled by the original push is running.

Let's take a lease in UpdateProjectStatisticsWorker as well to ensure that we're not running duplicate jobs.
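
To illustrate the idea, here is a minimal sketch of a lease-guarded worker. It is not the code from this MR: InMemoryLease, StatisticsWorkerSketch, the lease key format, and the injected clock are all hypothetical stand-ins, and the 15-minute timeout is assumed from the refresh interval described above. A real lease would live in a shared store rather than in process memory.

```ruby
# Hypothetical in-memory stand-in for an exclusive lease (illustration only).
class InMemoryLease
  @store = {} # lease key => expiry time in seconds

  class << self
    attr_reader :store
  end

  def initialize(key, timeout:, clock:)
    @key = key
    @timeout = timeout
    @clock = clock
  end

  # Returns true if the lease was obtained, false if someone still holds it.
  def try_obtain
    now = @clock.call
    expires_at = self.class.store[@key]
    return false if expires_at && expires_at > now

    self.class.store[@key] = now + @timeout
    true
  end
end

# Hypothetical worker: skips the expensive refresh unless it obtains the lease.
class StatisticsWorkerSketch
  LEASE_TIMEOUT = 15 * 60 # seconds; assumed from the 15-minute refresh limit

  attr_reader :refreshes

  def initialize(clock:)
    @clock = clock
    @refreshes = 0
  end

  def perform(project_id)
    lease = InMemoryLease.new(
      "project_statistics:#{project_id}", # hypothetical key format
      timeout: LEASE_TIMEOUT,
      clock: @clock
    )
    return :skipped unless lease.try_obtain

    @refreshes += 1 # stands in for the expensive repository size calculation
    :refreshed
  end
end

# Simulated timeline using a fake clock instead of real time.
now = 0
clock = -> { now }
worker = StatisticsWorkerSketch.new(clock: clock)

first = worker.perform(42)   # first job takes the lease and refreshes
now += 60
second = worker.perform(42)  # one minute later: lease still held, so skipped
now += 15 * 60
third = worker.perform(42)   # lease has expired: refresh runs again
```

In this sketch the duplicate job returns early instead of recalculating, so two jobs for the same project inside the lease window cost only one refresh.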

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

  1. Push a change to a project
  2. Observe that Gitaly has received a RepositorySize RPC from that request
  3. Wait 16 minutes
  4. Push another change to the same branch. This time, a RepositorySize RPC should not be triggered, because of the lease taken by UpdateProjectStatisticsWorker from the original push

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Closes #366513 (closed)
