Unstick locked MRs without merge JIDs
What does this MR do and why?
Unstick locked MRs that have no merge JID (null merge_jid).
We query the locked MRs without merge JID and attempt to unstick them if:
- there's no merge exclusive lease
- and they're not in a merge train
We then mark each MR as merged if any of the following holds:
- its merged_commit_sha is set in the DB.
- its merge_commit_sha is set.
- it has no diffs when compared with the target branch.
If none of those conditions applies to a locked MR, we unlock it instead, as we can't assume it is merged.
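The decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: the LockedMr struct, its has_diffs_with_target, lease_held and in_merge_train fields, and the unstick_action helper are all hypothetical names. Only merged_commit_sha and merge_commit_sha come from the description.

```ruby
# Hypothetical stand-in for a locked merge request. Only merged_commit_sha
# and merge_commit_sha are names from the description; the rest are assumed.
LockedMr = Struct.new(
  :id, :merged_commit_sha, :merge_commit_sha,
  :has_diffs_with_target, :lease_held, :in_merge_train,
  keyword_init: true
)

# Returns :skip, :mark_as_merged, or :unlock for a locked MR without a merge JID.
def unstick_action(mr)
  # A held merge exclusive lease or merge train membership means a merge may
  # still be in progress, so the MR is left alone.
  return :skip if mr.lease_held || mr.in_merge_train

  # Any one of these is treated as evidence that the merge actually happened.
  looks_merged = !mr.merged_commit_sha.nil? ||
                 !mr.merge_commit_sha.nil? ||
                 !mr.has_diffs_with_target

  # Without such evidence we cannot assume the MR is merged, so unlock it.
  looks_merged ? :mark_as_merged : :unlock
end
```

For example, an MR with no lease, no merge train, no commit SHAs, and remaining diffs against the target branch would yield :unlock in this sketch.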
This also includes changes that let us query locked MRs without requiring a new index, as long as the MR ID is in the Redis set.
On merge, we attempt to add the MR ID to a set in Redis. This set is used in StuckMergeJobsWorker to query MR records that are possibly stuck. We remove the MR ID from the set if:
- the MR is successfully unlocked (expected merge failure, or MR successfully merged)
- the MR is successfully unstuck by StuckMergeJobsWorker
If an MR gets processed by StuckMergeJobsWorker but the worker fails to unlock it due to a validation error, it won't get removed from the Redis set.
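The set lifecycle above can be illustrated with a small in-memory model. This is a sketch under stated assumptions: Ruby's Set stands in for the Redis set, and PossiblyStuckSet, TrackedMr, start_merge, finish and stuck_candidates are hypothetical names, not the classes or methods touched by this MR.

```ruby
require 'set'

# In-memory stand-in for the Redis set of possibly stuck MR IDs (assumption:
# the real implementation talks to Redis instead).
class PossiblyStuckSet
  def initialize
    @ids = Set.new
  end

  def add(mr_id)
    @ids.add(mr_id)
  end

  def remove(mr_id)
    @ids.delete(mr_id)
  end

  def ids
    @ids.to_a.sort
  end
end

# Hypothetical MR record; `valid` models whether saving the record passes validation.
TrackedMr = Struct.new(:id, :state, :valid, keyword_init: true)

# On merge: lock the MR and remember its ID as possibly stuck.
def start_merge(mr, tracker)
  mr.state = :locked
  tracker.add(mr.id)
end

# On unlock or unstick: the ID is removed only if the state change succeeds.
# On a validation error the MR stays locked and stays in the set.
def finish(mr, tracker, new_state)
  return false unless mr.valid

  mr.state = new_state
  tracker.remove(mr.id)
  true
end

# Worker side: look up records by the tracked IDs only, then keep the locked ones.
def stuck_candidates(all_mrs, tracker)
  tracked = tracker.ids
  all_mrs.select { |mr| tracked.include?(mr.id) && mr.state == :locked }
end
```

Because candidates are looked up by ID first, the locked-state filter only runs over a small, known set of records, which is the reason the description gives for not needing a new index.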
Related to gitlab-com/gl-infra/production#17921 (closed) and https://gitlab.com/gitlab-org/gitlab/-/issues/467377.
This is behind the following feature flags:
- unstick_locked_merge_requests_redis
- unstick_locked_mrs_without_merge_jid
Next iteration
As a separate issue, we can add the capability to add old locked MRs to the set in Redis so that StuckMergeJobsWorker can pick them up.
Rollout plan
- Enable unstick_locked_merge_requests_redis and keep it enabled for a week. See if we get any reports and watch Sentry errors and the performance of MergeService/MergeWorker/StuckMergeJobsWorker and Redis.
- Once good (no reported issues), enable it by default and get it released to self-managed instances.
- Enable unstick_locked_mrs_without_merge_jid on some GitLab projects and keep it enabled for a week. See if we get any reports and watch Sentry errors and the performance of StuckMergeJobsWorker.
- Once good (no reported issues), roll unstick_locked_mrs_without_merge_jid out incrementally and globally.
- Once good (no reported issues), enable it by default and get it released to self-managed instances.
- After a couple of milestones (I'm thinking 2 milestones), we can remove the FFs.