Counter metric to track query retries in open transaction
What does this MR do and why?
This adds a metric to the database load balancer to track query retries that may be problematic. We add additional dimensions to distinguish cases (inside_transaction, inside_dirty_transaction
).
Background in #220242 (comment 684439980)
How to set up and validate locally
I've been using the following scripts to play around with this a bit:
The "pathology":
require File.expand_path('../../config/environment', __FILE__)
ActiveRecord::Base.logger = Logger.new(STDOUT)
conn = Gitlab::Database::LoadBalancing.proxy
#conn = ActiveRecord::Base.connection
conn.execute(<<~SQL)
SET idle_in_transaction_session_timeout='1s';
DROP TABLE IF EXISTS foo;
CREATE TABLE foo (id serial primary key, value integer);
SQL
conn.transaction do
conn.insert("insert into foo (value) VALUES (1)")
sleep(2) # transaction times out
# This will run into a PG error, which is not raised.
# Instead, we retry the insert on a fresh connection
# and hence this violates transaction semantics.
conn.insert("insert into foo (value) VALUES (2)")
end
puts conn.select_all('select * from foo').to_a.inspect
See what happens if we restart postgres (severing the connection):
require File.expand_path('../../config/environment', __FILE__)
ActiveRecord::Base.logger = Logger.new(STDOUT)
conn = Gitlab::Database::LoadBalancing.proxy
binding.pry # restart postgres
conn = Gitlab::Database::LoadBalancing.proxy
conn.transaction do
conn.insert("insert into foo (value) VALUES (1)")
end
puts conn.select_all('select * from foo').to_a.inspect
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Edited by Andreas Brandl