Sometimes object storage migration doesn't clean up source files in file storage
Problem
There seems to be a race condition in which local files are not removed after they have been uploaded to object storage.
- Job trace files: https://gitlab.com/gitlab-com/infrastructure/issues/3658#note_74218061
- Job artifact files: https://gitlab.com/gitlab-com/migration/issues/311#note_73657945
- LFS files: gitlab-com/migration#252 (comment 73043290)
At the moment, we don't have any clues as to how this happens. My assumption is that even if Sidekiq runs the migration concurrently, the second request should be blocked by the exclusive lease.
/cc @mbergeron @ayufan @ahanselka
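To make the assumption about the exclusive lease concrete, here is a minimal, self-contained sketch of the pattern. It is not the actual GitLab implementation: `InMemoryLease` and `migrate_once` are hypothetical names, and the lease lives in process memory rather than in a shared store.

```ruby
# Hypothetical in-memory stand-in for an exclusive lease, for illustration only.
class InMemoryLease
  @store = {}        # lease key => expiry time
  @lock  = Mutex.new # guards @store against concurrent threads

  class << self
    # Returns true if the lease was obtained, false if another holder has it.
    def try_obtain(key, timeout:, now: Time.now)
      @lock.synchronize do
        expires_at = @store[key]
        if expires_at && expires_at > now
          false
        else
          @store[key] = now + timeout
          true
        end
      end
    end

    # Releases the lease early.
    def cancel(key)
      @lock.synchronize { @store.delete(key) }
    end
  end
end

# Runs the block only if no other worker holds the lease for `key`.
def migrate_once(key, timeout: 60)
  return :skipped unless InMemoryLease.try_obtain(key, timeout: timeout)

  begin
    yield # upload to object storage, then remove the local file
    :migrated
  ensure
    InMemoryLease.cancel(key)
  end
end
```

In this sketch a second call for the same key returns `:skipped` while the first is still running. Note that a lease with a timeout only blocks the second worker until the lease expires.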
Proposal
We should prepare a rake task to clean up the duplicated files.
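A rough sketch of the core logic such a task could wrap is shown below. All names here (`MigratedFile`, `cleanup_duplicated_files`, `remote_exists`) are hypothetical placeholders and not part of the GitLab codebase; the sketch assumes we can list migrated records and check whether the remote copy exists.

```ruby
require "fileutils"

# Hypothetical record for one migrated file: where the leftover local copy
# would be, and the key of the copy in object storage.
MigratedFile = Struct.new(:local_path, :remote_key)

# Removes local copies whose remote copy is confirmed to exist.
# `remote_exists` is any callable taking a remote key and returning true/false
# (assumed to query object storage in a real task).
# Returns the local paths that were removed, or would be removed in a dry run.
def cleanup_duplicated_files(files, remote_exists, dry_run: true)
  files.each_with_object([]) do |file, removed|
    next unless File.file?(file.local_path)         # no leftover local copy
    next unless remote_exists.call(file.remote_key) # never delete the only copy

    FileUtils.rm_f(file.local_path) unless dry_run
    removed << file.local_path
  end
end
```

Defaulting to `dry_run: true` lets us review the list of leftover files before anything is deleted, and the remote existence check guards against removing a file that was never actually uploaded.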
Edited by Shinya Maeda