Use ULIDs for CorrelationIDs
This change switches correlationIDs from random strings to ULIDs.
What is a ULID?
Universally Unique Lexicographically Sortable Identifier
UUID can be suboptimal for many use cases because:
- It isn't the most character efficient way of encoding 128 bits of randomness
- UUID v1/v2 is impractical in many environments, as it requires access to a unique, stable MAC address
- UUID v3/v5 requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures
- UUID v4 provides no other information than randomness which can cause fragmentation in many data structures
Instead, herein is proposed ULID:
- 128-bit compatibility with UUID
- 1.21e+24 unique ULIDs per millisecond
- Lexicographically sortable!
- Canonically encoded as a 26 character string, as opposed to the 36 character UUID
- Uses Crockford's base32 for better efficiency and readability (5 bits per character)
- Case insensitive
- No special characters (URL safe)
- Monotonic sort order (correctly detects and handles the same millisecond)
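As a rough illustration of the layout in the list above, here is a minimal sketch that packs a 48-bit millisecond timestamp and 80 bits of randomness into the 26-character Crockford base32 form. It uses only the Go standard library. The names encodeULID and crockford are hypothetical, chosen for this sketch, and are not part of labkit or any ULID library.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"time"
)

// crockford is Crockford's base32 alphabet (no I, L, O or U).
const crockford = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

// encodeULID (hypothetical helper) renders a 48-bit millisecond timestamp and
// 80 bits of entropy as 26 characters: 10 for the time, then 16 for randomness.
func encodeULID(ms uint64, entropy [10]byte) string {
	var out [26]byte

	// Timestamp: 10 characters, most significant first. 10 characters hold
	// 50 bits, so the top 2 bits are zero for any 48-bit timestamp.
	for i := 9; i >= 0; i-- {
		out[i] = crockford[ms&0x1f]
		ms >>= 5
	}

	// Entropy: 80 bits, handled as two 40-bit halves of 8 characters each.
	for half := 0; half < 2; half++ {
		var v uint64
		for _, b := range entropy[half*5 : half*5+5] {
			v = v<<8 | uint64(b)
		}
		for i := 7; i >= 0; i-- {
			out[10+half*8+i] = crockford[v&0x1f]
			v >>= 5
		}
	}
	return string(out[:])
}

func main() {
	var entropy [10]byte
	if _, err := rand.Read(entropy[:]); err != nil {
		panic(err) // acceptable in a sketch; real code should not panic here
	}
	fmt.Println(encodeULID(uint64(time.Now().UnixMilli()), entropy))
}
```

Because the timestamp occupies the leading characters and the alphabet is in ASCII order, IDs generated in later milliseconds compare greater as plain strings.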
ULIDs are attractive as correlationIDs because of one property in particular: they are lexicographically sortable, so it is easy to determine the order of requests given only a set of correlationIDs. The time of generation can even be recovered from the ULID.
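To make the ordering and timestamp claims concrete, the following sketch sorts a few IDs as plain strings and decodes the creation time from the first 10 characters of each. It assumes canonical 26-character ULIDs; ulidTime is a hypothetical helper written for this example, and the IDs in main are made up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// crockford is Crockford's base32 alphabet (no I, L, O or U).
const crockford = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

// ulidTime (hypothetical helper) decodes the 10-character timestamp prefix
// of a ULID into the time at which the ID was generated.
func ulidTime(id string) (time.Time, error) {
	if len(id) != 26 {
		return time.Time{}, fmt.Errorf("ulid: want 26 characters, got %d", len(id))
	}
	var ms int64
	// ULIDs are case insensitive, so normalise before looking characters up.
	for _, c := range strings.ToUpper(id[:10]) {
		v := strings.IndexRune(crockford, c)
		if v < 0 {
			return time.Time{}, fmt.Errorf("ulid: invalid character %q", c)
		}
		ms = ms<<5 | int64(v)
	}
	return time.UnixMilli(ms).UTC(), nil
}

func main() {
	// Made-up correlationIDs, deliberately out of order.
	ids := []string{
		"01ARYZ6S429X8KQ2M4N6P0R1S3",
		"01ARYZ6S40ZZZZZZZZZZZZZZZZ",
		"01ARYZ6S41TSV4RRFFQ69G5FAV",
	}
	sort.Strings(ids) // a plain string sort yields chronological order

	for _, id := range ids {
		ts, err := ulidTime(id)
		if err != nil {
			panic(err)
		}
		fmt.Println(id, ts.Format(time.RFC3339Nano))
	}
}
```

Note that the random suffix plays no part in the ordering across milliseconds: the ID ending in all Zs still sorts first because its timestamp prefix is the smallest.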
Switching to ULIDs (actually, the similar KSUIDs) is something I've wanted to do for a long time, and much of the code in !78 (merged) led me to realise that now is a good time to roll this out.
SafeRandomID()
This change also deprecates correlation.RandomID() in favour of correlation.SafeRandomID(). correlation.RandomID() returned an error, although no request should ever be cancelled due to the failure of a correlationID to be generated. This led to different applications using different fallback strategies.
SafeRandomID() ensures that a correlationID is always returned, even in the exceedingly unlikely case that the system does not have enough entropy. In that case, a simple fallback of the form E:<encodedtimestamp> is used as the correlationID.
Benchmark
pkg: gitlab.com/gitlab-org/labkit/correlation
BenchmarkSafeRandomID-8 7097577 165 ns/op
PASS
For posterity, below is the benchmark for the KSUID implementation that we used previously. ULID generation is roughly three times faster (165 ns/op versus 530 ns/op), although both are fine.
pkg: gitlab.com/gitlab-org/labkit/correlation
BenchmarkSafeRandomID-8 2266785 530 ns/op
PASS
cc @ash2k