Blob search should match on the blob id
Release notes
Code and Wiki search will now match documents on the underlying Git object ID.
Problem to solve
Currently, whenever a user searches for a blob (repository or wiki search), we only match the search terms against the document's file_name and content.
As Git is a content-addressable storage system, I think we should leverage that by allowing users to search for a blob's object ID (oid), a single SHA that uniquely identifies the file's full content (e.g. ffded2bb9b398af20fbc2f3e11c74b546f4c9764). This feature would be helpful for debugging, whenever we want to ensure a document is present in the index.
The best implementation path here would be a search filter, blob:<object-id> or oid:<object_id>.
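As a rough illustration of how such a filter could be pulled out of a query string, here is a minimal Python sketch. The function name extract_oid_filter, the regular expression, and the restriction to full 40-character hex IDs are all assumptions made for this example; they do not describe GitLab's actual search query parser.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a `blob:` or `oid:` prefix followed by a full
# 40-character hexadecimal object ID, delimited by whitespace or string ends.
_OID_FILTER = re.compile(r"(?:^|\s)(?:blob|oid):([0-9a-fA-F]{40})(?=\s|$)")


def extract_oid_filter(query: str) -> Tuple[Optional[str], str]:
    """Split a query into (object ID or None, remaining search terms)."""
    match = _OID_FILTER.search(query)
    if match is None:
        # No filter present: return the query unchanged apart from trimming.
        return None, query.strip()
    # Cut the filter out and normalise the whitespace left behind.
    remaining = query[:match.start()] + " " + query[match.end():]
    return match.group(1).lower(), " ".join(remaining.split())
```

With this sketch, a query such as "oid:ffded2bb9b398af20fbc2f3e11c74b546f4c9764 README" would be split into the object ID and the remaining term "README", so the ID could be matched exactly while the rest goes through the normal full-text search.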
Intended users
I think this feature is mainly useful for debugging purposes, so the intended users are Developers and Admins.
User experience goal
The goal is to have a simple way to ensure a document is present in the index.
Supposing the user has access to the document, running git hash-object <file-path> should yield the blob's object ID. Searching for this same object ID in GitLab should return that exact document, if it exists in the index.
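For reference, the object ID that git hash-object prints can be reproduced without Git. The sketch below assumes a repository using the default SHA-1 object format and raw file bytes with no clean filters or line-ending conversion applied; the function name git_blob_oid is made up for this example.

```python
import hashlib


def git_blob_oid(content: bytes) -> str:
    """Compute the SHA-1 blob object ID Git assigns to the given bytes."""
    # Git hashes a header ("blob <size>" plus a NUL byte) followed by the content.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known object ID.
print(git_blob_oid(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the ID depends only on the content, two files with identical bytes share one object ID regardless of their path or project.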
Proposal
Include a way for users to search for a specific Blob object ID.
Further details
Permissions and Security
A user might search for a specific file content across GitLab in order to scan public projects for vulnerabilities. Keep in mind that the blob's object ID changes every time the content changes, and there are most likely other ways to run such scans already.