Fastzip fails to archive cache due to temp path collision
Summary
The current fastzip implementation appears to contain a temp path collision, at least on Windows.
Steps to reproduce
- Set up a GitLab Runner on a Windows machine.
- Register it to one or multiple GitLab instances with/as shell executor.
- Set up a project that has multiple jobs with independent cache keys.
- Trigger a pipeline that will cause those jobs to run on that runner concurrently and will cause them to archive their caches at the same time.
- Observe the reported log messages.
I'm sorry, I cannot provide better steps to reproduce or a demo project at this time. If you really need that, please let me know and I'll take the time to derive a reproduction setup that I am allowed to publish.
Actual behavior
One of the jobs logged this while trying to zip the cache:
Creating cache DevBuild-Benchmark-Win64-1...
Runtime platform arch=amd64 os=windows pid=3172 revision=943fc252 version=13.7.0
<redacted>: found 56277 matching files and directories
FATAL: remove C:\WINDOWS\TEMP\fastzip_00: The process cannot access the file because it is being used by another process.
remove C:\WINDOWS\TEMP\fastzip_01: The process cannot access the file because it is being used by another process.
remove C:\WINDOWS\TEMP\fastzip_02: The process cannot access the file because it is being used by another process.
remove C:\WINDOWS\TEMP\fastzip_03: The process cannot access the file because it is being used by another process.
remove C:\WINDOWS\TEMP\fastzip_04: The process cannot access the file because it is being used by another process.
remove C:\WINDOWS\TEMP\fastzip_05: The process cannot access the file because it is being used by another process.
remove C:\WINDOWS\TEMP\fastzip_06: The process cannot access the file because it is being used by another process.
remove C:\WINDOWS\TEMP\fastzip_07: The process cannot access the file because it is being used by another process.
Failed to create cache
The paths do not appear to contain any means for avoiding collisions between concurrently running jobs or even between different applications using a similar implementation.
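To illustrate the suspected collision, here is a minimal sketch. It assumes the temp file names depend only on the temp directory and a running index, which is what the log above suggests (fastzip_00 through fastzip_07). The function name fixedPaths and the directory "tmp" are hypothetical and not taken from the fastzip source.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// fixedPaths mimics the naming pattern visible in the log output:
// the name depends only on the directory and an index, not on the process.
// This is an assumption derived from the log, not the actual fastzip code.
func fixedPaths(dir string, n int) []string {
	paths := make([]string, n)
	for i := range paths {
		paths[i] = filepath.Join(dir, fmt.Sprintf("fastzip_%02d", i))
	}
	return paths
}

func main() {
	// Two concurrent jobs that share the same temp directory.
	jobA := fixedPaths("tmp", 8)
	jobB := fixedPaths("tmp", 8)
	for i := range jobA {
		fmt.Printf("%s used by both jobs: %t\n", jobA[i], jobA[i] == jobB[i])
	}
}
```

Under that assumption, both jobs compute identical paths, so one process can end up trying to remove a file the other still holds open, which Windows refuses.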
Expected behavior
Creating cache DevBuild-Benchmark-Win64-1...
Runtime platform arch=amd64 os=windows pid=388 revision=943fc252 version=13.7.0
<redacted>: found 56277 matching files and directories
No URL provided, cache will be not uploaded to shared cache server. Cache will be stored only locally.
Created cache
Relevant logs and/or screenshots
Environment description
The issue occurred for us with a GitLab runner with the following properties:
- shell executor (likely relevant)
- with concurrent=2, limit=2 (likely relevant)
- multiple jobs running and pending (likely relevant)
- FF_USE_FASTZIP is set to 1 for the involved projects. (likely relevant)
- running on a Windows machine (possibly relevant)
- registered to multiple different GitLab instances (probably not relevant)
This is a self-hosted runner registered to multiple self-hosted GitLab instances operated by different parties.
config.toml contents
concurrent = 2
check_interval = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "[redacted]"
url = "[redacted]"
token = "[redacted]"
limit = 1
executor = "shell"
builds_dir = "C:/gitlab-1/b"
cache_dir = "C:/gitlab-1/c"
shell = "cmd"
[runners.custom_build_dir]
[runners.cache]
[runners.cache.s3]
[runners.cache.gcs]
[[runners]]
name = "[redacted]"
url = "[redacted]"
token = "[redacted]"
limit = 2
executor = "shell"
builds_dir = "C:/gitlab-2/b"
cache_dir = "C:/gitlab-2/c"
shell = "cmd"
output_limit = 40960
[runners.custom_build_dir]
[runners.cache]
[runners.cache.s3]
[runners.cache.gcs]
[runners.cache.azure]
The second token was used to run the jobs in question.
Used GitLab Runner version
/gitlab-runner-windows-amd64.exe --version
Version: 13.7.0
Git revision: 943fc252
Git branch: 13-7-stable
GO version: go1.13.8
Built: 2020-12-21T13:47:18+0000
OS/Arch: windows/amd64
Possible fixes
A very brief search led to this line: https://gitlab.com/gitlab-org/gitlab-runner/-/blob/master/vendor/github.com/saracen/fastzip/internal/filepool/filepool.go#L145
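One possible direction, sketched here as an assumption and not as the upstream fix, is to let the standard library pick unique names. os.MkdirTemp and os.CreateTemp replace the trailing * in the pattern with a random string, so concurrent processes should not end up with the same path. The helper name uniqueTempNames is hypothetical. Both functions require Go 1.16 or later; the runner in this report was built with go1.13.8, where ioutil.TempDir and ioutil.TempFile are the equivalents.

```go
package main

import (
	"fmt"
	"os"
)

// uniqueTempNames creates n temp files inside a freshly created private
// directory and returns their paths. The directory and its files are
// removed again before returning; only the names are kept for inspection.
func uniqueTempNames(n int) ([]string, error) {
	// A per-call directory with a random suffix isolates this job's files.
	dir, err := os.MkdirTemp("", "fastzip-*")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	names := make([]string, 0, n)
	for i := 0; i < n; i++ {
		f, err := os.CreateTemp(dir, "fastzip_*")
		if err != nil {
			return nil, err
		}
		names = append(names, f.Name())
		// Close before cleanup; Windows cannot remove open files.
		if err := f.Close(); err != nil {
			return nil, err
		}
	}
	return names, nil
}

func main() {
	// Simulate two jobs; their paths differ even with the same base temp dir.
	for job := 1; job <= 2; job++ {
		names, err := uniqueTempNames(2)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("job %d: %v\n", job, names)
	}
}
```

Either measure alone (a private directory per job, or random file names) would likely avoid the collision; the sketch combines both for clarity.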