Cleaned Commits can still be accessed after running Repository cleanup
Summary
Sensitive information committed to GitLab cannot be removed from GitLab.
gitlab-foss!26555 (merged) was meant to remove the OIDs in order to resolve gitlab-foss#30093 (closed). However, gitlab-foss#30093 (closed) was closed even though the sensitive information still cannot actually be removed.
Steps to reproduce
- Set up a non-empty repository
- Commit a "secret" file
- Remove the "secret" file in a second commit
- Use BFG to clean the sensitive information (see https://gitlab.com/help/user/project/repository/reducing_the_repo_size_using_git.md)
- Use "Repository cleanup" to make GitLab aware of the changes (Settings -> Repository -> Repository cleanup)
- Optionally run Housekeeping; it doesn't change anything. You can also garbage collect the commits more aggressively using
git reflog expire --expire=now --all && git gc --prune=now --aggressive
and force push your changes. Even though git cat-file -t <SHA>
then proves the OIDs are gone locally, they still exist on GitLab.
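The local half of the steps above can be sketched as a throwaway script. This is only an illustration under stated assumptions: the BFG rewrite is replaced by simply moving the branch back to the commit before the secret (so the script needs nothing but git), the file contents and identity are placeholders, and nothing is pushed to GitLab. It shows that the old OID really is gone locally after the reflog expiry and garbage collection.

```shell
#!/bin/sh
# Local-only sketch of the reproduction steps; nothing is pushed to GitLab.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
git init -q repo
cd repo
git config user.email "reporter@example.com"   # placeholder identity
git config user.name "Reporter"

# Set up a non-empty repository.
echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"
base_sha=$(git rev-parse HEAD)

# Commit a "secret" file and remember its commit SHA.
echo "placeholder secret" > sensitive.txt
git add sensitive.txt
git commit -q -m "Add sensitive file"
secret_sha=$(git rev-parse HEAD)

# Remove the "secret" file in a second commit.
git rm -q sensitive.txt
git commit -q -m "Remove sensitive file"

# ASSUMPTION: stand-in for the BFG rewrite. Instead of rewriting the
# commits, drop them by moving the branch back to the initial commit.
git reset -q --hard "$base_sha"
git update-ref -d ORIG_HEAD   # reset leaves ORIG_HEAD pointing at the old tip

# Expire the reflog and prune unreachable objects, as in the report.
git reflog expire --expire=now --all && git gc --quiet --prune=now --aggressive

# Check whether the old commit object still exists locally.
if git cat-file -t "$secret_sha" >/dev/null 2>&1; then
  local_status="present"
else
  local_status="gone"
fi
echo "local object $secret_sha: $local_status"
```

The server-side part (force push, "Repository cleanup", Housekeeping) is not covered here, since that is exactly where the old OIDs survive.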
Example Project
https://gitlab.com/Kirchhof/remove-sensitive-info-project/commits/master
What is the current bug behavior?
In commit e45649f69c93ab6ded0e9958ea64e1d78dbf1eda, I added a "sensitive.txt" file. Even though I used BFG to clean the commit history, the commit can still be accessed through the direct link Kirchhof/remove-sensitive-info-project@e45649f6.
Even worse, the old commit SHAs, including the direct links, still show up in the activity log (Project -> Activity).
Worst of all, if you access the SHA through the link and click "Browse files", you can download the whole repo including the sensitive files. If you tag the "deleted" commit, which is possible even after running "Repository cleanup", you can even check out the whole tree again.
What is the expected correct behavior?
Kirchhof/remove-sensitive-info-project@e45649f6 should lead to a 404 Not Found. Ideally, the commit should also not show up with its old SHA on the activity page.
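One possible way to check this expectation from the command line is to ask the GitLab commits API for the old SHA and look at the HTTP status. The helper names below are hypothetical, and the sketch assumes a public project reachable without a token. The API endpoint is also not the same thing as the web link above, so it may not behave identically.

```shell
#!/bin/sh
# Hypothetical helpers for checking whether a cleaned commit is still served.

# Build the commits API URL for a host, project path, and commit SHA.
commit_api_url() {
  host=$1
  project=$2
  sha=$3
  # The API expects the namespace/project path URL-encoded ("/" becomes "%2F").
  encoded=$(printf '%s' "$project" | sed 's|/|%2F|g')
  printf 'https://%s/api/v4/projects/%s/repository/commits/%s\n' "$host" "$encoded" "$sha"
}

# Print only the HTTP status code returned for that commit.
commit_http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$(commit_api_url "$1" "$2" "$3")"
}
```

For the example project this would be called as `commit_http_status gitlab.com Kirchhof/remove-sensitive-info-project e45649f69c93ab6ded0e9958ea64e1d78dbf1eda`. With the expected behavior, the printed status would be 404.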
Output of checks
This bug happens on GitLab.com and also on self-hosted instances.
/cc @nick.thomas