Skip to content

Optimise merge request diff keep-arounds

James Fargher requested to merge mr_diff_keep_arounds into master

What does this MR do and why?

#486685 (closed)

This MR seeks to reduce the number of keep-arounds generated by merge request diffs:

  • No need to create keep-arounds on merge-base commits as these are already kept by the head commit.
  • No need to create keep-arounds on merge-head diffs as these are temporary and are already kept by refs/merge-requests/<iid>/merge.

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

We'll run these steps twice. Once on master to get a baseline and once on this branch. So first checkout master, then:

  1. Create a repo from template (I'm using the rails template).
  2. Create a merge request targeting main with changes (I just delete a file and set a target branch name for ease).
  3. Push a change to main (delete a file and leave the target branch name as master).
  4. Run the keep-around orphan rake task:
    $ bundle exec rake gitlab:keep_around:orphaned FILENAME=baseline_before.csv PROJECT_PATH=root/example
    I, [2024-10-10T10:32:42.318370 #1130156]  INFO -- : Finding keep-around references...
    I, [2024-10-10T10:32:42.340563 #1130156]  INFO -- : Checking pipeline shas...
    I, [2024-10-10T10:32:42.399837 #1130156]  INFO -- : Checking merge request shas...
    I, [2024-10-10T10:32:42.420645 #1130156]  INFO -- : Checking merge request diff shas...
    I, [2024-10-10T10:32:42.448000 #1130156]  INFO -- : Checking note shas...
    W, [2024-10-10T10:32:42.448062 #1130156]  WARN -- : System notes will not be included.
    W, [2024-10-10T10:32:42.465635 #1130156]  WARN -- : Sent notifications will not be included.
    I, [2024-10-10T10:32:42.465672 #1130156]  INFO -- : Checking todo shas...
    I, [2024-10-10T10:32:42.484489 #1130156]  INFO -- : Keep-around orphan report complete
  5. Visit/refresh the merge request page (this triggers the merge-head to be updated).
  6. Run the keep-around orphan rake task again:
    $ bundle exec rake gitlab:keep_around:orphaned FILENAME=baseline_after.csv PROJECT_PATH=root/example
    I, [2024-10-10T10:35:13.235033 #1130649]  INFO -- : Finding keep-around references...
    I, [2024-10-10T10:35:13.256559 #1130649]  INFO -- : Checking pipeline shas...
    I, [2024-10-10T10:35:13.305883 #1130649]  INFO -- : Checking merge request shas...
    I, [2024-10-10T10:35:13.332446 #1130649]  INFO -- : Checking merge request diff shas...
    I, [2024-10-10T10:35:13.367821 #1130649]  INFO -- : Checking note shas...
    W, [2024-10-10T10:35:13.367860 #1130649]  WARN -- : System notes will not be included.
    W, [2024-10-10T10:35:13.386342 #1130649]  WARN -- : Sent notifications will not be included.
    I, [2024-10-10T10:35:13.386381 #1130649]  INFO -- : Checking todo shas...
    I, [2024-10-10T10:35:13.407148 #1130649]  INFO -- : Keep-around orphan report complete
  7. Compare the output of the two orphan reports:
    --- baseline_before.csv	2024-10-10 10:32:42.484035299 +1300
    +++ baseline_after.csv	2024-10-10 10:35:13.405345447 +1300
    @@ -1,4 +1,5 @@
     operation,commit_id
    +keep,27cd9fb30f025d20cd0f0ce47bad120816bcc679
     keep,495f3164dc4770539044d30d953f897beaa9eade
     keep,5293981aeeea286cc1d0acee6e472172845ca543
     keep,afd50ffc325afcff37bf05ecd1327aa2f854d811
    Note that "keep" without a corresponding "usage" means there is a keep-around that the DB no longer references. This is bad. Also note that 27cd9fb30f025d20cd0f0ce47bad120816bcc679 is a generated merge commit.
  8. Now switch to this branch, restart GDK and repeat from step 1.
  9. Now if we compare the output of the final report between baseline and the fix we'll notice that the fix has 40% less keep-arounds even for a trivial case such as this and that after the fix all keep-arounds correspond with a database field.
  10. Additionally we can compare the refs of each project in full:
       [9] pry(main)> baseline.repository.raw.list_refs(['refs/'])
    => [<Gitaly::ListRefsResponse::Reference: name: "refs/heads/master", target: "495f3164dc4770539044d30d953f897beaa9eade", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/heads/some-mr", target: "d02931562285ddaa4657840f32c51e7cab9c0a93", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/keep-around/27cd9fb30f025d20cd0f0ce47bad120816bcc679", target: "27cd9fb30f025d20cd0f0ce47bad120816bcc679", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/keep-around/495f3164dc4770539044d30d953f897beaa9eade", target: "495f3164dc4770539044d30d953f897beaa9eade", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/keep-around/5293981aeeea286cc1d0acee6e472172845ca543", target: "5293981aeeea286cc1d0acee6e472172845ca543", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/keep-around/afd50ffc325afcff37bf05ecd1327aa2f854d811", target: "afd50ffc325afcff37bf05ecd1327aa2f854d811", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/keep-around/d02931562285ddaa4657840f32c51e7cab9c0a93", target: "d02931562285ddaa4657840f32c51e7cab9c0a93", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/merge-requests/1/head", target: "d02931562285ddaa4657840f32c51e7cab9c0a93", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/merge-requests/1/merge", target: "27cd9fb30f025d20cd0f0ce47bad120816bcc679", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/pipelines/7", target: "d02931562285ddaa4657840f32c51e7cab9c0a93", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/pipelines/8", target: "495f3164dc4770539044d30d953f897beaa9eade", peeled_target: "">]
    [10] pry(main)> fix.repository.raw.list_refs(['refs/'])
    => [<Gitaly::ListRefsResponse::Reference: name: "refs/heads/delete-file", target: "233f5976dd79438ad83fd7030d7bb1ad7ab005a3", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/heads/master", target: "4023cd8f4c6d5ff06d9a60d5a1fa7208973868bc", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/keep-around/233f5976dd79438ad83fd7030d7bb1ad7ab005a3", target: "233f5976dd79438ad83fd7030d7bb1ad7ab005a3", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/keep-around/4023cd8f4c6d5ff06d9a60d5a1fa7208973868bc", target: "4023cd8f4c6d5ff06d9a60d5a1fa7208973868bc", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/keep-around/5293981aeeea286cc1d0acee6e472172845ca543", target: "5293981aeeea286cc1d0acee6e472172845ca543", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/merge-requests/1/head", target: "233f5976dd79438ad83fd7030d7bb1ad7ab005a3", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/merge-requests/1/merge", target: "587b3c601726bd1f3082e2be4bfed3661ef2350f", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/pipelines/10", target: "4023cd8f4c6d5ff06d9a60d5a1fa7208973868bc", peeled_target: "">,
     <Gitaly::ListRefsResponse::Reference: name: "refs/pipelines/9", target: "233f5976dd79438ad83fd7030d7bb1ad7ab005a3", peeled_target: "">]  
    From this we can see that the merge-head is correctly only referred to by the scoped ref and the two "extra" keep-arounds in the baseline afd50ffc325afcff37bf05ecd1327aa2f854d811 and 27cd9fb30f025d20cd0f0ce47bad120816bcc679 are both merge commits.
Edited by James Fargher

Merge request reports

Loading