Fix race condition on merge train with better exclusive lock
What does this MR do?
During the merge train is being processed, the process is wrapped by exclusive lock. This exclusive lock is effective on the entire merge train, however, this could prevent from some MRs (especially earlier ones) to be processed properly and it could result in unmerged merge requests.
We fix this problem by making the exclusive lock more granular. Each merge request is wrapped by exclusive lock instead of entire merge train. This allows the system to let multiple processes to work on a single train, but still prevents concurrent process at the same merge request to avoid unexpected behavior.
Close https://gitlab.com/gitlab-org/gitlab-ee/issues/12423
Problem to solve
Currently, train processor locks the thread until it's finished refreshing all the following merge requests in a train.
This is safe, however, could cause a race condition in the following case:
RefreshMergeRequestsService: PID-1
Case: MR3 pipeline finished
v
| MR1 | MR2 | MR3 | MR4 | MR5 |
RefreshMergeRequestsService: PID-2
Case: MR1 pipeline finished
v <= should be processed
| MR1 | MR2 | MR3 | MR4 | MR5 |
We should process MR1 in PID-2 in the above case, otherwise, MR1 won't be merged.
Approach
Based on https://gitlab.slack.com/archives/D5RJFGJSD/p1561990650021300 by @ayufan
def execute(merge_request)
Redis.push_to_queue(queue_key, merge_request.id)
process_queue
end
def process_queue
in_lock(lock_key) do
process_all
end
# if we have anything added new to the queue
# force refresh of the queue asynchronously
# it will only be executed after in_lock succeeds,
# otherwise it should raise exception
RefreshMergeRequestTrainWorker.perform_async in anything_in_queue?
end
def process_all
merge_request_ids = Redis.pop_all_from_queue(queue_key)
merge_request_id = merge_request_ids.find_the_first_in_train
proces_all_following(merge_request_id)
end
def anything_in_queue?
Redis.len(queue_key) > 0
end
def queue_key
'merge_train:project_id:target_branch'
end
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry - [-] Documentation created/updated or follow-up review issue created
-
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides - [-] Separation of EE specific content
Performance and testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. - [-] Tested in all supported browsers
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team