Increase ProcessBookkeepingService batch to 10_000

Dylan Griffith requested to merge increase-bulk-indexer-batch-sizes into master

What does this MR do?

Monitoring shows that batches of 1,000 jobs take no more than 5.5s on average, with a median of around 4s, so we should be safe to increase the batch size tenfold and still process batches within the desired time window.
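
As a rough sanity check on the tenfold increase, the sketch below extrapolates the observed timings linearly. It assumes per-job cost stays constant as the batch grows, which is not something measured here, and `projected_duration` is a hypothetical helper rather than anything in the codebase.

```python
def projected_duration(observed_seconds, observed_batch, new_batch):
    """Linearly extrapolate a batch duration to a new batch size.

    Assumption: the cost per job is constant, so the duration grows
    in proportion to the number of jobs in the batch.
    """
    return observed_seconds * new_batch / observed_batch


# Timings quoted above for batches of 1,000 jobs.
median_at_10k = projected_duration(4.0, 1_000, 10_000)
average_at_10k = projected_duration(5.5, 1_000, 10_000)

print(f"projected median:  {median_at_10k:.1f}s")
print(f"projected average: {average_at_10k:.1f}s")
```

Under that assumption a 10,000-job batch would land at roughly 40s median and 55s average. Real batches may scale better or worse than linearly, so the monitoring should be rechecked after the change.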

This started from a conversation at !28511 (comment 315475648), where we initially wanted this to be configurable. It turns out, though, that this number should roughly reflect how quickly a single core can marshal this many jobs and send them to Elasticsearch, so it is unlikely to benefit much from configuration. For now we'll increase it to 10k, which still seems like a safe number and gives us a lot of scaling headroom before we have to figure out how to parallelize this work.

Percentile durations

Screen_Shot_2020-04-30_at_3.53.24_pm

I also tried plotting averages and the 95th and 99th percentiles, but these make the charts ugly. They do reveal some large outliers, but those aren't at all correlated with payload size, so I don't think they give much cause for concern about increasing the payload size, if I'm interpreting the data correctly:

Outlier durations

Screen_Shot_2020-04-30_at_3.57.01_pm
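
For anyone who wants to repeat this check outside the dashboard, here is a minimal sketch of the same analysis: summarize durations by median and upper percentiles, then test whether duration tracks payload size. The sample values are made up for illustration and are not taken from the charts above. `summarize` and `pearson` are hypothetical helper names.

```python
from math import sqrt
from statistics import mean, median, quantiles


def summarize(durations):
    """Return the median, 95th and 99th percentile of a list of durations."""
    cuts = quantiles(durations, n=100)  # 99 cut points: cuts[i] is percentile i + 1
    return {"p50": median(durations), "p95": cuts[94], "p99": cuts[98]}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Illustrative samples only: (payload size in bytes, duration in seconds).
samples = [
    (120_000, 3.9), (450_000, 4.1), (300_000, 4.0), (80_000, 21.0),
    (500_000, 4.3), (200_000, 3.8), (350_000, 4.2), (150_000, 4.0),
]
sizes = [size for size, _ in samples]
durations = [seconds for _, seconds in samples]

print(summarize(durations))
print("size/duration correlation:", round(pearson(sizes, durations), 3))
```

A correlation near zero alongside a high p99 would match the reading above: the slow batches are outliers caused by something other than payload size.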

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Yorick Peterse