Add a throttle to sync service DB usage
## What does this MR do and why?

`PackageMetadata::SyncService` relies on bulk upserts to update package metadata. Without a throttle, the service takes up a significant portion of the instance's database cycles. This is especially problematic on resource-constrained instances during the initial package metadata sync, which tries to import the entire corpus. This MR adds a simple throttle which reduces the transactions per second that the service sends.
### Using `sleep`

The codebase has several throttle-based mechanisms, but because this worker is designed to run on its own, the currently executing `SyncService` instance is the only one which has to throttle. Therefore the use of `sleep` seems sufficient.
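As a rough sketch of the idea (the class and method names here are hypothetical, not the actual `SyncService` code), a fixed sleep after each ingested batch caps how many batches hit the database per second:

```ruby
# Hypothetical sketch of a sleep-based throttle; not the real SyncService.
class ThrottledIngester
  def initialize(throttle_rate)
    # Seconds to pause after each batch; 0 disables the throttle.
    @throttle_rate = throttle_rate
  end

  # Ingests rows in fixed-size slices, sleeping between slices so the
  # database sees at most ~1/throttle_rate batches per second.
  # Returns the number of batches processed.
  def ingest_all(rows, slice_size)
    batches = 0
    rows.each_slice(slice_size) do |slice|
      yield slice # the actual bulk upsert would happen here
      batches += 1
      sleep(@throttle_rate)
    end
    batches
  end
end
```

Because the worker runs as a single instance, no cross-process coordination (e.g. a Redis-backed rate limiter) is needed for this to be effective.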
### 750 millisecond sleep
Demonstrating the performance issue (as originally reported) in a reproducible way is quite difficult and something on which I'm still working.

The method I used was to run `pgbench` at the same time as the sync service. The baseline transactions per second rate was measured against a quiet GitLab database with no queries or workers running. This was then repeated against a set of throttle factors to get the performance degradation at each setting.
This was the command run:

```shell
pgbench -h $gl_postgresql -T60 -P5 gitlabhq_development
```
The baseline `pgbench` numbers:

- 6408 transactions per second
- 384512 transactions processed

Additionally, the csv rows per second were also recorded in order to get the sync service throughput (`sync csv recs/s` in the table below) for each throttle factor.
| throttle factor | transactions per second | % of tps baseline | transactions processed | % of baseline transactions processed | sync csv recs/s | % of baseline csv recs/s |
|---|---|---|---|---|---|---|
| 0.0 | 5287.14 | 82.51 | 317253.00 | 82.51 | 2926.26 | |
| 0.1 | 5716.34 | 89.21 | 343005.00 | 89.21 | 2347.16 | 80.21 |
| 0.25 | 5940.00 | 92.70 | 356425.00 | 92.70 | 1746.22 | 59.67 |
| 0.5 | 5962.00 | 93.04 | 357769.00 | 93.04 | 1230.00 | 42.03 |
| 0.75 | 6212.65 | 96.95 | 372781.00 | 96.95 | 945.00 | 32.29 |
| 0.9 | 6310.00 | 98.47 | 378624.00 | 98.47 | 829.17 | 28.34 |
A throttle factor of 0.75 was chosen fairly arbitrarily: it leaves the sync service at about 1/3 of its baseline csv record throughput while costing only about 3% of the baseline database tps rate.
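As a back-of-the-envelope check on these numbers (the per-slice ingest time below is inferred, not measured): with `INGEST_SLICE_SIZE = 1000` and a 0.75 s sleep, the sleep alone caps throughput at roughly 1333 rows/s, and the measured 945 rows/s implies each 1000-row upsert itself takes about 0.3 s:

```ruby
SLICE_SIZE = 1000      # rows per bulk upsert (INGEST_SLICE_SIZE)
THROTTLE_RATE = 0.75   # seconds slept after each slice

# Upper bound imposed by the sleep alone (ignoring ingest time):
max_rows_per_sec = SLICE_SIZE / THROTTLE_RATE
# ~1333 rows/s

# Working backwards from the measured throughput, the per-slice
# ingest time is whatever is left after subtracting the sleep:
measured_rows_per_sec = 945.0
ingest_secs_per_slice = SLICE_SIZE / measured_rows_per_sec - THROTTLE_RATE
# ~0.31 s per 1000-row slice
```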
`SyncService` throughput was measured by applying this diff and pulling the data from `log/application_json.log`.
```diff
diff --git a/ee/app/services/package_metadata/sync_service.rb b/ee/app/services/package_metadata/sync_service.rb
index 6816c15d95ab..072415d9f67d 100644
--- a/ee/app/services/package_metadata/sync_service.rb
+++ b/ee/app/services/package_metadata/sync_service.rb
@@ -4,7 +4,7 @@ module PackageMetadata
   class SyncService
     UnknownAdapterError = Class.new(StandardError)
     INGEST_SLICE_SIZE = 1000

     def self.execute(signal)
       SyncConfiguration.all.each do |config|
@@ -34,14 +34,27 @@ def initialize(connector, version_format, purl_type, signal)
       @signal = signal
     end

-    def execute
+    def execute(run_id, throttle_rate)
+      num_inserted = 0
+      start = Time.now
+      log_it = Proc.new do
+        t = Time.now
+        Gitlab::AppJsonLogger.debug(class: self.class.name, message: "end run",
+          start: start, end: t, dur: t - start, run: run_id, throttle_rate: throttle_rate,
+          num_inserted: num_inserted, rate: num_inserted / (t - start))
+      end
       connector.data_after(checkpoint).each do |csv_file|
         Gitlab::AppJsonLogger.debug(class: self.class.name,
           message: "Evaluating data for #{purl_type}/#{version_format}/#{csv_file.sequence}/#{csv_file.chunk}")
         csv_file.each_slice(INGEST_SLICE_SIZE) do |data_objects|
           ingest(data_objects)
-          sleep(THROTTLE_RATE)
+          num_inserted += INGEST_SLICE_SIZE
+          sleep(throttle_rate)
+          if Time.now - start > 60
+            log_it.call
+            return
+          end
         end

         checkpoint.update(sequence: csv_file.sequence, chunk: csv_file.chunk)
```
This was run in the console (for `pypi`) using:

```ruby
svcs = PackageMetadata::SyncConfiguration.all.select { |c| c.purl_type == 'pypi' }
  .map do |c|
    conn = PackageMetadata::SyncService.connector_for(c)
    PackageMetadata::SyncService.new(conn, c.version_format, c.purl_type, OpenStruct.new('stop?' => false))
  end

svcs.first.execute("run-for-0.1", 0.1)
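The per-run throughput can then be pulled out of the log with a short script along these lines (a sketch; the field names follow the logging diff above, and the log path is an assumption based on the development default):

```ruby
require 'json'

# Extract [run_id, rows-per-second] pairs from each "end run" entry
# written by the instrumented execute method.
def run_rates(log_path)
  File.foreach(log_path).filter_map do |line|
    entry = JSON.parse(line) rescue next
    next unless entry['message'] == 'end run'

    [entry['run'], entry['rate']]
  end
end
```

Called as e.g. `run_rates('log/application_json.log')`, this skips non-JSON lines and unrelated debug entries, returning only the summary rates.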
## How to set up and validate locally

_Numbered steps to set up and validate the change are strongly suggested._
## MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

- I have evaluated the MR acceptance checklist for this MR.
Related to #399217 (closed)