Search is unusably slow with some search terms
Reposted from Zulip, as this appears to be nontrivial. In our Antora project, certain search terms hang the search UI for about 45 seconds in Chromium, popping up several "unresponsive" dialogs before results are returned. Going to the site and typing 'VK' in the search box is fine; typing 'VK_' causes the slowdown.
There are tens of thousands of words starting with 'VK_' in the generated search-index.js, and other common API prefixes with many matches behave similarly. This is unusably slow for what will be a very common search.
In the Zulip discussion, @eskwayrd ran some diagnostics in Firefox and reported:
In Firefox, I get a drop-down dialog at the top of the page reporting "This page is slowing down Firefox. To speed up your browser, stop this page." I get the option to Stop or Debug Script.
When I click Debug Script, the developer tools appear pointing to the findTermPosition function in the antora-lunr extension's search-ui.js. This makes me think that the UI logic, and not lunr.js itself, is taking a long time to perform some operation.
@mojavelinux followed up:
A quick look at that function draws my eyes to this statement:
char.match(lunr.tokenizer.separator)
If it's true that this function is moving character by character through each result, that means it is running a regular expression for every character. That's going to be terribly inefficient. There are definitely better ways to write this logic, especially if the separator is a string. In that case, using an equality match would be so much faster. But it might even be better to avoid the whole for loop.
In general, code should do everything it can do to avoid the use of regular expressions until there is nothing else left to use. This is one of the techniques that makes Asciidoctor (and Antora) so fast. Regular expressions are one of the slowest operations in JavaScript and, while necessary, should only be used when nothing else will suffice.
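To illustrate the suggestion quoted above, here is a minimal sketch of the two approaches. It is not the actual findTermPosition code from search-ui.js. The names SEPARATOR, SEPARATOR_CHARS, tokenStartsRegex and tokenStartsFast are hypothetical, and SEPARATOR is only a stand-in that assumes lunr.tokenizer.separator is a regular expression matching whitespace and hyphens. The lookup-based version covers ASCII whitespace and the hyphen only, whereas \s also matches other Unicode whitespace, so the two agree only on ASCII input.

```javascript
// Hypothetical stand-in for lunr.tokenizer.separator (assumed: whitespace and hyphens).
const SEPARATOR = /[\s\-]+/

// Assumed ASCII-only equivalent of the separator, used for plain lookups.
const SEPARATOR_CHARS = new Set([' ', '\t', '\n', '\r', '\f', '\v', '-'])

// Per-character regex test, similar in spirit to char.match(lunr.tokenizer.separator).
// Returns the index at which each token starts.
function tokenStartsRegex (text) {
  const starts = []
  let inToken = false
  for (let i = 0; i < text.length; i++) {
    if (SEPARATOR.test(text[i])) {
      inToken = false
    } else if (!inToken) {
      starts.push(i)
      inToken = true
    }
  }
  return starts
}

// Same loop, but with a Set lookup in place of the regular expression.
function tokenStartsFast (text) {
  const starts = []
  let inToken = false
  for (let i = 0; i < text.length; i++) {
    if (SEPARATOR_CHARS.has(text[i])) {
      inToken = false
    } else if (!inToken) {
      starts.push(i)
      inToken = true
    }
  }
  return starts
}

console.log(tokenStartsFast('VK_FOO bar-baz')) // [ 0, 7, 11 ]
```

Whether this alone removes the hang would need profiling against the real index. If the loop runs over every character of tens of thousands of results, restructuring it may matter more than the cost of each individual check.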