
Remove dependencies on Linguist

This saves about 128 MB of baseline RAM usage per Unicorn and Sidekiq process (!).

Linguist wasn't detecting languages anymore from CE/EE since 9ae8b574. However, Linguist::BlobHelper was still being depended on by BlobLike and others.

This removes the Linguist gem, since it is no longer required. EscapeUtils was previously pulled in as a dependency of Linguist; because Banzai depends on it, it is now added to the Gemfile explicitly.
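As a rough illustration of the explicit dependency, the Gemfile entry could look like the fragment below. No version constraint is shown here because the description does not state one; that detail is left open.

```ruby
# Gemfile (fragment, illustrative only)
# EscapeUtils is used by Banzai, so it is declared directly rather than
# relying on another gem to pull it in.
# Assumption: no version constraint is shown; pin one as appropriate.
gem 'escape_utils'
```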

Previously, Linguist was used to detect the best ACE mode. Instead, we rely on ACE to guess the best mode based on the file extension.

Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/35450

It also fixes workflow problems such as gitaly!821 (comment 92251967).

Memory Usage (from derailed_benchmarks)

Before

TOP: 216.0078 MiB
  linguist: 128.4023 MiB
    linguist/language: 117.8516 MiB (Also required by: linguist/lazy_blob)
    linguist/blob_helper: 10.3281 MiB (Also required by: linguist/file_blob, linguist/blob, and 2 others)
      mime/types: 8.8125 MiB (Also required by: mime/types/columnar, /Users/stanhu/.rbenv/versions/2.4.4/lib/ruby/gems/2.4.0/gems/rest-client-2.0.2/lib/restclient/request, and 2 others)
        mime/types/registry: 8.0898 MiB
      charlock_holmes: 0.832 MiB (Also required by: TOP)
        charlock_holmes/encoding_detector: 0.4063 MiB
        charlock_holmes/charlock_holmes: 0.3945 MiB
  rails/all: 20.7383 MiB

derailed-before.txt

After

TOP: 133.5313 MiB
  rails/all: 20.6719 MiB

derailed-no-linguist.txt
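For readers who want a quick sanity check on a single library, the idea of a per-require memory cost can be sketched in plain Ruby. This is a minimal sketch, not how derailed_benchmarks is implemented: the helper names `rss_kib` and `require_cost_mib` are hypothetical, and it assumes a Linux `/proc` filesystem or a Unix `ps` that reports RSS in KiB. Results vary between runs and machines.

```ruby
# Current resident set size (RSS) of this process, in KiB.
# Assumption: Linux exposes VmRSS in /proc/self/status; otherwise fall back
# to `ps`, which reports RSS in KiB on common Unix systems.
def rss_kib
  status = '/proc/self/status'
  if File.readable?(status)
    line = File.foreach(status).find { |l| l.start_with?('VmRSS:') }
    return line.split[1].to_i if line
  end
  `ps -o rss= -p #{Process.pid}`.to_i
end

# Approximate RSS growth, in MiB, caused by requiring +library+.
# A library that is already loaded will report a cost near zero.
def require_cost_mib(library)
  GC.start
  before = rss_kib
  require library
  GC.start
  (rss_kib - before) / 1024.0
end

# Example: measure a standard-library file.
puts format('json: %.4f MiB', require_cost_mib('json'))
```

Swapping `'json'` for `'linguist'` in a checkout that still bundles the gem would give a ballpark figure to compare against the report above.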

Edited by Stan Hu
