Add a throttle to sync service DB usage
## What does this MR do and why?

`PackageMetadata::SyncService` relies on bulk upserts to update package metadata. Without a throttle, the service takes up a significant portion of the instance's database cycles. This is especially problematic on resource-constrained instances during the initial package metadata sync, which tries to import the entire corpus. This MR adds a simple throttle which reduces the transactions per second that the service sends.
### Using `sleep`

The codebase has several throttle-based mechanisms, but because this worker is designed to run on its own, the currently executing `SyncService` instance is the only one which has to throttle. Therefore the use of `sleep` seems sufficient.
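As a rough sketch of the idea (the class and method names here are hypothetical, not the actual `SyncService` code), a fixed sleep after each ingested batch caps how many batches hit the database per second:

```ruby
# Hypothetical sketch of a sleep-based throttle; not the real SyncService.
class ThrottledIngester
  def initialize(throttle_rate)
    # Seconds to pause after each batch; 0 disables the throttle.
    @throttle_rate = throttle_rate
  end

  # Ingests rows in fixed-size slices, sleeping between slices so the
  # database sees at most ~1/throttle_rate batches per second.
  # Returns the number of batches processed.
  def ingest_all(rows, slice_size)
    batches = 0
    rows.each_slice(slice_size) do |slice|
      yield slice # the actual bulk upsert would happen here
      batches += 1
      sleep(@throttle_rate)
    end
    batches
  end
end
```

Because the worker runs as a single instance, no cross-process coordination (e.g. a Redis-backed rate limiter) is needed for this to be effective.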
### 750 millisecond sleep
Demonstrating the performance issue (as originally reported) in a reproducible way is quite difficult and something on which I'm still working.

The method I used was to run `pgbench` at the same time as the sync service. The baseline transactions per second rate was measured against a quiet GitLab database with no queries or workers running. This was then repeated against a set of throttle factors to get the performance degradation at each setting.
This was the command run:

```shell
pgbench -h $gl_postgresql -T60 -P5 gitlabhq_development
```
The baseline `pgbench` numbers:

- 6408 transactions per second
- 384512 transactions processed

Additionally, the csv rows per second were also recorded in order to get the sync service throughput (`sync csv recs/s` in the table below) for each throttle factor.
| throttle factor | transactions per second | % of tps baseline | transactions processed | % of baseline transactions processed | sync csv recs/s | % of baseline csv recs/s |
|---|---|---|---|---|---|---|
| 0.0 | 5287.14 | 82.51 | 317253.00 | 82.51 | 2926.26 | |
| 0.1 | 5716.34 | 89.21 | 343005.00 | 89.21 | 2347.16 | 80.21 |
| 0.25 | 5940.00 | 92.70 | 356425.00 | 92.70 | 1746.22 | 59.67 |
| 0.5 | 5962.00 | 93.04 | 357769.00 | 93.04 | 1230.00 | 42.03 |
| 0.75 | 6212.65 | 96.95 | 372781.00 | 96.95 | 945.00 | 32.29 |
| 0.9 | 6310.00 | 98.47 | 378624.00 | 98.47 | 829.17 | 28.34 |
A throttle factor of 0.75 was chosen fairly arbitrarily: it leaves the sync service at about 1/3 of its baseline csv record throughput while costing only about 3% of the baseline database tps rate.
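As a back-of-the-envelope check on these numbers (the per-slice ingest time below is inferred, not measured): with `INGEST_SLICE_SIZE = 1000` and a 0.75 s sleep, the sleep alone caps throughput at roughly 1333 rows/s, and the measured 945 rows/s implies each 1000-row upsert itself takes about 0.3 s:

```ruby
SLICE_SIZE = 1000      # rows per bulk upsert (INGEST_SLICE_SIZE)
THROTTLE_RATE = 0.75   # seconds slept after each slice

# Upper bound imposed by the sleep alone (ignoring ingest time):
max_rows_per_sec = SLICE_SIZE / THROTTLE_RATE
# ~1333 rows/s

# Working backwards from the measured throughput, the per-slice
# ingest time is whatever is left after subtracting the sleep:
measured_rows_per_sec = 945.0
ingest_secs_per_slice = SLICE_SIZE / measured_rows_per_sec - THROTTLE_RATE
# ~0.31 s per 1000-row slice
```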
`SyncService` throughput was measured by applying this diff and pulling the data from `log/application_json.log`.
```diff
diff --git a/ee/app/services/package_metadata/sync_service.rb b/ee/app/services/package_metadata/sync_service.rb
index 6816c15d95ab..072415d9f67d 100644
--- a/ee/app/services/package_metadata/sync_service.rb
+++ b/ee/app/services/package_metadata/sync_service.rb
@@ -4,7 +4,7 @@ module PackageMetadata
   class SyncService
     UnknownAdapterError = Class.new(StandardError)
     INGEST_SLICE_SIZE = 1000

     def self.execute(signal)
       SyncConfiguration.all.each do |config|
@@ -34,14 +34,27 @@ def initialize(connector, version_format, purl_type, signal)
       @signal = signal
     end

-    def execute
+    def execute(run_id, throttle_rate)
+      num_inserted = 0
+      start = Time.now
+      log_it = Proc.new do
+        t = Time.now
+        Gitlab::AppJsonLogger.debug(class: self.class.name, message: "end run",
+          start: start, end: t, dur: t - start, run: run_id, throttle_rate: throttle_rate,
+          num_inserted: num_inserted, rate: num_inserted / (t - start))
+      end
       connector.data_after(checkpoint).each do |csv_file|
         Gitlab::AppJsonLogger.debug(class: self.class.name,
           message: "Evaluating data for #{purl_type}/#{version_format}/#{csv_file.sequence}/#{csv_file.chunk}")
         csv_file.each_slice(INGEST_SLICE_SIZE) do |data_objects|
           ingest(data_objects)
-          sleep(THROTTLE_RATE)
+          num_inserted += INGEST_SLICE_SIZE
+          sleep(throttle_rate)
+          if Time.now - start > 60
+            log_it.call
+            return
+          end
         end

         checkpoint.update(sequence: csv_file.sequence, chunk: csv_file.chunk)
```
This was run in the console (for `pypi`) using:

```ruby
svcs = PackageMetadata::SyncConfiguration.all.select { |c| c.purl_type == 'pypi' }
  .map do |c|
    conn = PackageMetadata::SyncService.connector_for(c)
    PackageMetadata::SyncService.new(conn, c.version_format, c.purl_type, OpenStruct.new('stop?' => false))
  end

svcs.first.execute("run-for-0.1", 0.1)
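The per-run throughput can then be pulled out of the log with a short script along these lines (a sketch; the field names follow the logging diff above, and the log path is an assumption based on the development default):

```ruby
require 'json'

# Extract [run_id, rows-per-second] pairs from each "end run" entry
# written by the instrumented execute method.
def run_rates(log_path)
  File.foreach(log_path).filter_map do |line|
    entry = JSON.parse(line) rescue next
    next unless entry['message'] == 'end run'

    [entry['run'], entry['rate']]
  end
end
```

Called as e.g. `run_rates('log/application_json.log')`, this skips non-JSON lines and unrelated debug entries, returning only the summary rates.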
## How to set up and validate locally

_Numbered steps to set up and validate the change are strongly suggested._
## MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

- I have evaluated the MR acceptance checklist for this MR.
Related to #399217 (closed)