Allocate EntryMap keys only when needed

Will Chandler (ex-GitLab) requested to merge wc/borrow-file-entry into master

Currently we unconditionally copy each JSON string and pid into a new EntryData struct as we process the buffer.

The current lifetime of JSON strings is:

graph TD
  A[Copy file to buffer] --> B
  B[Copy JSON to EntryData] --> C
  C[Hash JSON and check if present in map] --> D
  C --> E
  D[If no, move owned copy of JSON to map]
  E[If yes, free copied JSON]

This is how the C implementation did things, so it was a reasonable place to start. However, with 12 Puma worker threads in production, keys that are not pid-significant are replicated many times over, and every repeat after the first is copied only to be freed once the lookup finds the key already present. Each EntryData struct we allocate is consumed before we finish processing the file buffer, so we can borrow directly from that buffer instead of unconditionally copying each chunk of JSON out of it.

Using the RawEntry API available in the hashbrown crate, we allocate a new EntryData only when the key is not already present in the map.
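As a rough illustration of the borrow-first idea, here is a minimal sketch that uses only stable `std`. It is not the code in this merge request: because `std` does not expose RawEntry on stable, the sketch swaps in a `get_mut` lookup followed by `insert` on a miss, which hashes the key twice on a miss but allocates in the same places. The `EntryMap` alias, the `f64` value type, the function names, and the tab-separated line format are all assumptions made up for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real map: owned JSON keys, summed values.
pub type EntryMap = HashMap<String, f64>;

/// Merge `(json, value)` pairs borrowed from a file buffer into `map`.
/// Returns how many keys had to be copied (allocated) into the map.
pub fn merge_borrowed<'buf>(
    map: &mut EntryMap,
    entries: impl IntoIterator<Item = (&'buf str, f64)>,
) -> usize {
    let mut copied = 0;
    for (json, value) in entries {
        // `String: Borrow<str>`, so the lookup accepts the borrowed slice as-is.
        if let Some(total) = map.get_mut(json) {
            // Hit: update in place, nothing is allocated.
            *total += value;
        } else {
            // Miss: only now copy the JSON out of the buffer.
            map.insert(json.to_owned(), value);
            copied += 1;
        }
    }
    copied
}

/// Split a toy buffer of `json<TAB>value` lines into borrowed entries.
/// The line format is invented for this example.
pub fn parse_buffer<'buf>(buffer: &'buf str) -> impl Iterator<Item = (&'buf str, f64)> {
    buffer.lines().filter_map(|line| {
        let (json, value) = line.split_once('\t')?;
        Some((json, value.parse::<f64>().ok()?))
    })
}
```

Feeding a buffer with three lines, two of which share the same JSON key, through `merge_borrowed(&mut map, parse_buffer(buffer))` copies only two keys into the map; the repeated key is matched against the borrowed slice and never allocated.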

The new lifetime pattern is:

graph TD
  A[Copy file to buffer] --> B
  B[Borrow JSON from buffer] --> C
  C[Hash JSON and check if present in map] --> D
  D[If no, copy it into map as key]

hashbrown is currently the underlying hashmap implementation used by std::collections::HashMap, so there's no change to the map implementation itself, but std does not expose the RawEntry API on stable Rust. We also no longer need to depend on ahash directly, as it is the default hasher used by hashbrown.

This change cuts mean benchmark time by ~15% (4.273 s to 3.623 s in the timing runs below), putting us more than 5x faster than the C implementation:

  Warming up --------------------------------------
                     C     1.000  i/100ms
                  rust     2.000  i/100ms
  Calculating -------------------------------------
                     C      5.760  (± 0.0%) i/s -     29.000  in   5.035721s
                  rust     30.186  (± 3.3%) i/s -    152.000  in   5.039813s

  Comparison:
                  rust:       30.2 i/s
                     C:        5.8 i/s - 5.24x  slower

With borrowed key check:

  Benchmark 1: bundle exec ./bin/benchmark
    Time (mean ± σ):      3.623 s ±  0.033 s    [User: 2.954 s, System: 0.643 s]
    Range (min … max):    3.565 s …  3.667 s    10 runs

With unconditional copy of JSON:

  Benchmark 1: bundle exec ./bin/benchmark
    Time (mean ± σ):      4.273 s ±  0.081 s    [User: 3.589 s, System: 0.660 s]
    Range (min … max):    4.206 s …  4.458 s    10 runs
