Fast-forward merge requests: database transaction timeout in update_merge_sha_metadata causes merge request to get stuck
Summary
- Customer reported that a merge request had failed to close, although the change did get merged to the target branch master. This happened on GitLab.com.
- GitLab team members can read more in the ticket.
- Reviewing their MR, we could see the merge commit generated by the merge operation pinged back to the system notes on the MR.
  - No "#person merged #time ago" message is logged.
- The merge request shows as open and with no changes or commits.
  - This looks to be in part because the developer may have rebased their branch and pushed it.
  - Additionally the merge request code would see its changes on the target branch 🤷.
- We think a pipeline should have triggered on the merge request, but either that wasn't triggered or Rails flagged the whole pipeline as skipped. There is a skipped pipeline on the MR, but it's for the wrong commit, so we suspect the code didn't get as far as running the post-merge pipeline.
- No error messages in Gitaly logs; 100% of the events for this correlation ID are level: info.
- We found a failure in Sidekiq (stack trace below; a short console illustration of this error class follows this list):

  MergeWorker JID-44d9064962bd9fa5d8845868: fail: 16.600004 sec
  json.exception.message: PG::QueryCanceled: ERROR: canceling statement due to statement timeout
  json.db_duration_s: 15.264
  json.db_main_duration_s: 15
  json.db_ci_duration_s: 0.003
  json.db_primary_duration_s: 15.271
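For reference, here is a rough Rails console illustration (not taken from the incident) of how that logged error reaches application code: when PostgreSQL's statement_timeout fires, the adapter raises PG::QueryCanceled and Rails wraps it in ActiveRecord::QueryCanceled, which is what the merge code path ultimately sees.

# Illustration only, in a plain Rails console against PostgreSQL; GitLab's
# load balancer adds the extra wrapping visible in the stack trace below.
ActiveRecord::Base.connection.execute("SET statement_timeout = '50ms'")

begin
  ActiveRecord::Base.connection.execute('SELECT pg_sleep(1)')
rescue ActiveRecord::QueryCanceled => e
  e.cause.class # => PG::QueryCanceled
  e.message     # => "PG::QueryCanceled: ERROR:  canceling statement due to statement timeout ..."
end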
The last call in the stack trace prior to the lib/gitlab/database calls was in update_merge_sha_metadata, at merge_request.update!(**data_to_update) if data_to_update.present?. This code changed quite recently in !97871 (merged):
def update_merge_sha_metadata(commit_id)
data_to_update = merge_success_data(commit_id)
data_to_update[:squash_commit_sha] = source if merge_request.squash_on_merge?
merge_request.update!(**data_to_update) if data_to_update.present?
end
def merge_success_data(commit_id)
{ merge_commit_sha: commit_id }
end
Steps to reproduce
Cause the database call in merge_request.update! to fail (one way to simulate this in a spec is sketched below).
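A hypothetical RSpec sketch of that reproduction (not an existing spec; the project, user, and mergeable merge_request fixtures, and the exact stubbing point inside MergeService, are assumptions): stub the record update to raise the error class Rails maps PG::QueryCanceled to, then run the merge service against an otherwise mergeable MR.

# Hypothetical spec sketch: simulates only the canceled statement, not the
# real statement_timeout or the load-balancer transaction around it.
it 'leaves the merge request open when updating merge metadata times out' do
  allow(merge_request).to receive(:update!)
    .and_raise(ActiveRecord::QueryCanceled, 'canceling statement due to statement timeout')

  service = MergeRequests::MergeService.new(
    project: project,
    current_user: user,
    params: { sha: merge_request.diff_head_sha }
  )

  expect { service.execute(merge_request) }.to raise_error(ActiveRecord::QueryCanceled)

  # The branch may already be merged on the Gitaly side, but the MR record
  # was never updated, so it stays open with no merge_commit_sha.
  expect(merge_request.reload).to be_opened
  expect(merge_request.merge_commit_sha).to be_nil
end

In the incident the worker hit this via a real statement timeout on the UPDATE, so a spec like this only exercises the error-handling side, not the slow query itself.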
Example Project
What is the current bug behavior?
A database error while updating the merge request isn't handled and causes the merge worker to terminate. The MR is then left in the wrong state: the change is merged on the target branch, but the merge request stays open.
What is the expected correct behavior?
- Roll back?
- Retry? (one possible shape is sketched below)
- Merge request ends in a recoverable state?
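As a minimal sketch of the "retry" option (one possible shape only, not a decided fix; it keeps the current method and assumes Gitlab::AppLogger is an acceptable place to record the failure): rescue the canceled statement, retry the small metadata UPDATE once, and log rather than fail the job if it still times out, since the Gitaly merge has already happened at this point.

# Sketch only. Behaviour questions (should Sidekiq still retry the job?
# should a follow-up job repair the MR?) are left open by the list above.
def update_merge_sha_metadata(commit_id)
  data_to_update = merge_success_data(commit_id)
  data_to_update[:squash_commit_sha] = source if merge_request.squash_on_merge?
  return if data_to_update.blank?

  attempts = 0
  begin
    merge_request.update!(**data_to_update)
  rescue ActiveRecord::QueryCanceled => e
    attempts += 1
    retry if attempts < 2

    Gitlab::AppLogger.warn(
      message: 'merge SHA metadata update timed out',
      merge_request_id: merge_request.id,
      error: e.message
    )
  end
end

Swallowing the error keeps the worker alive but still leaves the metadata missing, so it probably only makes sense together with the "recoverable state" option, for example something that can backfill merge_commit_sha later.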
Relevant logs and/or screenshots
lib/gitlab/database/load_balancing/connection_proxy.rb:120:in `block in write_using_load_balancer'
lib/gitlab/database/load_balancing/load_balancer.rb:115:in `block in read_write'
lib/gitlab/database/load_balancing/load_balancer.rb:191:in `retry_with_backoff'
lib/gitlab/database/load_balancing/load_balancer.rb:111:in `read_write'
lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `write_using_load_balancer'
lib/gitlab/database/load_balancing/connection_proxy.rb:61:in `block (2 levels) in <class:ConnectionProxy>'
lib/gitlab/database/load_balancing/connection_proxy.rb:120:in `block in write_using_load_balancer'
lib/gitlab/database/load_balancing/load_balancer.rb:115:in `block in read_write'
lib/gitlab/database/load_balancing/load_balancer.rb:191:in `retry_with_backoff'
lib/gitlab/database/load_balancing/load_balancer.rb:111:in `read_write'
lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `write_using_load_balancer'
lib/gitlab/database/load_balancing/connection_proxy.rb:71:in `transaction'
lib/gitlab/database/load_balancing/connection_proxy.rb:120:in `block in write_using_load_balancer'
lib/gitlab/database/load_balancing/load_balancer.rb:115:in `block in read_write'
lib/gitlab/database/load_balancing/load_balancer.rb:191:in `retry_with_backoff'
lib/gitlab/database/load_balancing/load_balancer.rb:111:in `read_write'
lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `write_using_load_balancer'
lib/gitlab/database/load_balancing/connection_proxy.rb:71:in `transaction'
app/services/merge_requests/merge_service.rb:106:in `update_merge_sha_metadata'
app/services/merge_requests/merge_service.rb:95:in `commit'
app/services/merge_requests/merge_service.rb:33:in `block in execute'
app/models/merge_request.rb:1510:in `in_locked_state'
app/services/merge_requests/merge_service.rb:32:in `execute'
app/services/merge_requests/merge_service.rb:20:in `execute'
app/workers/merge_worker.rb:29:in `perform'
lib/gitlab/database/load_balancing/sidekiq_server_middleware.rb:26:in `call'
lib/gitlab/sidekiq_middleware/duplicate_jobs/strategies/until_executed.rb:17:in `perform'
lib/gitlab/sidekiq_middleware/duplicate_jobs/duplicate_job.rb:58:in `perform'
lib/gitlab/sidekiq_middleware/duplicate_jobs/server.rb:8:in `call'
lib/gitlab/sidekiq_middleware/worker_context.rb:9:in `wrap_in_optional_context'
lib/gitlab/sidekiq_middleware/worker_context/server.rb:19:in `block in call'
lib/gitlab/application_context.rb:113:in `block in use'
lib/gitlab/application_context.rb:113:in `use'
lib/gitlab/application_context.rb:54:in `with_context'
lib/gitlab/sidekiq_middleware/worker_context/server.rb:17:in `call'
lib/gitlab/sidekiq_status/server_middleware.rb:7:in `call'
lib/gitlab/sidekiq_versioning/middleware.rb:9:in `call'
lib/gitlab/sidekiq_middleware/query_analyzer.rb:7:in `block in call'
lib/gitlab/database/query_analyzer.rb:37:in `within'
Output of checks
This happened on GitLab.com - GitLab Enterprise Edition 15.5.0-pre 0a1d315d