Backend: Runner cache prefix causes incorrect cache to be used
Summary
Runner cache prefix causes incorrect cache to be used
Steps to reproduce
Example .gitlab-ci.yml
```yaml
stages:
  - build
  - test
  - deploy

build-job:
  stage: build
  script:
    - echo "Compiled executable for $CI_JOB_NAME" > build/build-file
    - cat build/build-file
  cache:
    # Multiple caches must be given as a list, one entry per key
    - key:
        files:
          - file1.lock
      paths:
        - build/build-file
    - key:
        files:
          - file2.lock
      paths:
        - build/example

test-job:
  stage: test
  script:
    - echo "Compiled executable for $CI_JOB_NAME" > build/build-file
    - cat build/build-file
  cache:
    - key:
        files:
          - file2.lock
      paths:
        - build/example
    - key:
        files:
          - file3.lock
      paths:
        - build/build-file

deploy-job:
  stage: deploy
  script:
    - cat build/build-file
  cache:
    - key:
        files:
          - file1.lock
      paths:
        - build/build-file
    - key:
        files:
          - file2.lock
      paths:
        - build/example
```
Example Project
https://gitlab.com/mbadeau/issue-388374
What is the current bug behavior?
Cache names are built from a prefix and the commit SHA of the file named in the `key:files` keyword. Currently the prefix is an index number, and that index is based on the order of the caches within each job, which causes cache mismatches between jobs.
As a result, each cache is unique and regenerated for every job unless the `cache:key:files` keywords appear in the same order in each job and point to the same files.
An easier-to-understand example, since this confused me when re-reading it:
- Job 1 creates a cache using two `key:files` keywords: `key:files:lockfile1.json` and `key:files:lockfile2.json`
  - This creates `0-COMMITSHA` and `1-COMMITSHA`
- Job 2 also creates a cache using two `key:files` keywords: `key:files:lockfile2.json` and `key:files:lockfile3.json`
  - This again creates `0-COMMITSHA` and `1-COMMITSHA`, but these are not associated with the correct keys. They are simply based on the order. Even reordering `lockfile1.json` and `lockfile2.json` will cause this issue.
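The mismatch described above can be modelled in a few lines of Python. This is only a sketch of the naming scheme as this report describes it, not GitLab's actual implementation: the function `index_prefixed_names` and the SHA values are hypothetical and exist purely for illustration.

```python
def index_prefixed_names(key_files, commit_sha):
    """Model the index-based naming: position within the job, then the file's SHA."""
    return {
        file_name: f"{index}-{commit_sha[file_name]}"
        for index, file_name in enumerate(key_files)
    }


# Hypothetical SHAs of the last commit that changed each lock file.
commit_sha = {
    "lockfile1.json": "aaa111",
    "lockfile2.json": "bbb222",
    "lockfile3.json": "ccc333",
}

job1 = index_prefixed_names(["lockfile1.json", "lockfile2.json"], commit_sha)
job2 = index_prefixed_names(["lockfile2.json", "lockfile3.json"], commit_sha)

# The same file ends up under two different cache names.
print(job1["lockfile2.json"])  # 1-bbb222
print(job2["lockfile2.json"])  # 0-bbb222
```

Because `lockfile2.json` sits at index 1 in Job 1 and index 0 in Job 2, the two jobs look up different cache names for the same file, so Job 2 cannot reuse what Job 1 stored.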
What is the expected correct behavior?
Each job should respect a `key:files` cache throughout the entire pipeline. For example, `key:files:backend/package-lock.json` and `key:files:frontend/package-lock.json` should each have a unique cache and prefix that is not based on the index order in which they were added to the job.
Based on the easier-to-understand example:
- Job 1 creates a cache using two `key:files` keywords: `key:files:lockfile1.json` and `key:files:lockfile2.json`
  - This creates `0-COMMITSHA` and `1-COMMITSHA`
- Job 2 also creates a cache using two `key:files` keywords: `key:files:lockfile2.json` and `key:files:lockfile3.json`
  - This creates `1-COMMITSHA` and `2-COMMITSHA`
- Job 3 creates a cache using two `key:files` keywords: `key:files:lockfile2.json` and `key:files:lockfile1.json`
  - This creates `1-COMMITSHA` and `0-COMMITSHA`
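One way to read the expected behavior is that each file keeps the same prefix across the whole pipeline, regardless of where it appears in a job. The Python sketch below reproduces the numbers from the list above under that assumption. The helper `assign_pipeline_prefixes` is hypothetical and only illustrates the desired property. It does not describe how GitLab should implement it.

```python
def assign_pipeline_prefixes(jobs):
    """Give each key file one prefix for the whole pipeline, in first-seen order."""
    prefixes = {}
    for key_files in jobs.values():
        for file_name in key_files:
            # A file keeps the prefix it was given the first time it appeared.
            prefixes.setdefault(file_name, len(prefixes))
    return prefixes


jobs = {
    "job1": ["lockfile1.json", "lockfile2.json"],
    "job2": ["lockfile2.json", "lockfile3.json"],
    "job3": ["lockfile2.json", "lockfile1.json"],
}

prefixes = assign_pipeline_prefixes(jobs)
cache_names = {
    job: [f"{prefixes[file_name]}-COMMITSHA" for file_name in key_files]
    for job, key_files in jobs.items()
}

for job, names in cache_names.items():
    print(job, names)
# job1 ['0-COMMITSHA', '1-COMMITSHA']
# job2 ['1-COMMITSHA', '2-COMMITSHA']
# job3 ['1-COMMITSHA', '0-COMMITSHA']
```

Here `lockfile2.json` maps to prefix `1` in every job, so all three jobs resolve it to the same cache.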
Relevant logs and/or screenshots
Output of checks
This bug happens on GitLab.com
Possible fixes
Partially revert !104885 (merged), where the collisions were prevented via an index prefix. Instead, generate the prefix from the files listed in `cache:key:files`. (Additionally, ensure that the files can be listed out of order.)
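A minimal Python sketch of that idea follows, assuming the prefix is a short hash of the sorted file list. The function name `files_prefix`, the use of SHA-256, and the 8-character truncation are all illustrative assumptions, not part of the proposal or of GitLab's code.

```python
import hashlib


def files_prefix(files):
    """Derive a prefix from the listed files, independent of their order."""
    # Sorting (and de-duplicating) makes [a, b] and [b, a] hash identically.
    normalized = "\n".join(sorted(set(files)))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:8]


same_files_a = files_prefix(["lockfile1.json", "lockfile2.json"])
same_files_b = files_prefix(["lockfile2.json", "lockfile1.json"])
other_files = files_prefix(["lockfile3.json"])

print(same_files_a == same_files_b)  # True: listing order does not matter
print(same_files_a == other_files)   # False: different files, different prefix
```

Because the prefix depends only on which files are listed, it stays the same no matter which job declares the cache or at which position, which is the property the index-based prefix lacks.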