Add a UUID to each Diff File when the raw data is processed
What does this MR do?
For #33867.
Diff Files don't have a unique identifier. This MR adds one to diff files on the front end using a combination of values that - together - are always unique.
- `file.blob.id` = SHA1 of the git blob, which is sometimes unique, but not frequently enough (in one MR with 6 files, this was duplicated once)
- `file.diff_refs.{base,start,head}_sha` = `base_sha` & `start_sha` are often identical across many MRs, and `head_sha` will only be unique in a given MR - but not unique for any file in that MR
- `file.file_identifier_hash` = SHA1 of `${file_path}-${new}-${deleted}-${renamed}`, should be unique in a given MR, but no uniqueness in a project / across MR versions
- `file.blob.mode` = Never unique, just the file mode number
All six of these are used to get a unique ID for a diff file. By combining `blob.id` and `{base,start,head}_sha` we should be able to roughly pinpoint the commit and source file we're dealing with. By combining `file_identifier_hash` and `blob.mode` we should be able to identify a certain iteration of that source file in that commit.
Together, `file_identifier_hash` and `blob.mode` identify a diff file uniquely within a single MR.
Together, `blob.id` and `diff_refs.{base,start,head}_sha` identify a given source file across any MR.
Both of those combinations uniquely identify any diff file across any MR.
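As a rough illustration of the idea (not the code in this MR), the six values could be combined like this. The field names follow the description above; the helper name `diffFileId`, the `-` separator, and the sample values are assumptions made up for this sketch.

```javascript
// Hypothetical helper: joins the six values named above into one identifier.
// The separator and the function name are assumptions for illustration only.
function diffFileId(file) {
  const { base_sha, start_sha, head_sha } = file.diff_refs;

  return [
    file.blob.id,
    base_sha,
    start_sha,
    head_sha,
    file.file_identifier_hash,
    file.blob.mode,
  ].join('-');
}

// Two made-up files in the same MR version that share a blob SHA
// (for example, identical contents at different paths).
const diffRefs = { base_sha: 'b1', start_sha: 'b1', head_sha: 'h1' };
const fileA = {
  blob: { id: 'blob1', mode: '100644' },
  diff_refs: diffRefs,
  file_identifier_hash: 'hashA',
};
const fileB = {
  blob: { id: 'blob1', mode: '100644' },
  diff_refs: diffRefs,
  file_identifier_hash: 'hashB',
};

console.log(diffFileId(fileA)); // blob1-b1-b1-h1-hashA-100644
console.log(diffFileId(fileA) !== diffFileId(fileB)); // true
```

Here `blob.id` and the three SHAs are identical for both files, so only `file_identifier_hash` keeps the two IDs apart, which is why no single value is enough on its own.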
Screenshots
N/A, all ~backstage
Does this MR meet the acceptance criteria?
Conformity
- [-] Changelog entry
- [-] Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- [-] Database guides
- [-] Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention `@gitlab-com/gl-security/appsec`
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team