Prevent stack overflowing with a lot of user references (!22247) · Merge requests · GitLab.org / GitLab

Sean McGivern requested to merge 119333-systemstackerror-stack-level-too-deep into master Dec 24, 2019

What does this MR do?

When an issue, MR, or epic contains a lot of user references (even to the same user), we use the method Banzai::ReferenceParser::BaseParser.collection_objects_for_ids to load those users and maintain the loaded users for the current request in a RequestStore cache.

This method calls Hash#values_at with the splatted IDs - so one argument per ID passed in. If enough IDs are passed in, it's possible for the arguments here to overflow the stack.

Let's say we have:

ids = [1] * 1_000_000; ids.length
#=> 1000000
cache = {}
#=> {}
cache.values_at(*ids).compact
# SystemStackError: stack level too deep

We could do this cache.values_at(*ids.uniq).compact, but that would fail in this case:

distinct_ids = 1.upto(1_000_000).to_a; distinct_ids.length
#=> 1000000
cache.values_at(*distinct_ids.uniq).compact
# SystemStackError: stack level too deep

So the best option seems to be to just write a slower, but non-stack-consuming, version:

Benchmark.realtime { ids.uniq.map { |id| cache[id] }.compact }
#=> 0.15192799968644977
Benchmark.realtime { distinct_ids.uniq.map { |id| cache[id] }.compact }
#=> 0.3621319998055696

To test this locally, you can replace the last issue's description with a lot of mentions of a user (here I've created a user with the username 't'), but it takes a long time:

problem = ('@t ' * 1_000_000)[0...1_000_000]; problem.length
#=> 1000000
Issue.last.update!(description: problem_string)

Visiting that issue in the UI will then overflow the stack. With this change, it won't (although it is very slow).

Closes #119333 (closed)

Edited Dec 24, 2019 by Sean McGivern

Prevent stack overflowing with a lot of user references

What does this MR do?

Merge request reports