Optimize document parsing for Banzai ReferenceCache

Problem

Banzai processing relies on a chain of filters. Each filter receives the output of the previous filter, then processes and returns the modified content.
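As a rough illustration of the chaining described above, here is a minimal Ruby sketch. The two lambdas and the `run_pipeline` helper are hypothetical stand-ins, not actual Banzai filters or APIs; they only show how each step consumes the previous step's output.

```ruby
# Hypothetical stand-ins for filters: each takes content and returns modified content.
upcase_filter = ->(content) { content.upcase }
exclaim_filter = ->(content) { "#{content}!" }

# Run the chain: the output of one filter becomes the input of the next.
def run_pipeline(filters, content)
  filters.reduce(content) { |current, filter| filter.call(current) }
end

result = run_pipeline([upcase_filter, exclaim_filter], "hello")
```

Here `result` is the content after both filters have run in order.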

We have multiple ReferenceFilters that include a ReferenceCache. The ReferenceCache builds a cached collection of parent references by parsing the markdown document. For large documents, this parsing can be expensive in both time and memory.

The biggest problem is that the document is parsed again for each ReferenceFilter, so the parsing overhead is multiplied by the number of reference filters that run.

Proposal

Avoid unnecessary document parsing calls
Parse the document once and allow reference filters to access it
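One possible shape for this proposal, sketched in Ruby under stated assumptions: `CountingParser`, `SharedDocument`, and `PrefixReferenceFilter` are hypothetical names invented for illustration and do not correspond to real Banzai classes. The sketch memoizes the parsed result in a shared object so that every reference filter reads from the same parse.

```ruby
# Hypothetical stand-in for an expensive parser; it counts how often it runs.
class CountingParser
  attr_reader :parse_count

  def initialize
    @parse_count = 0
  end

  # Splits the text into whitespace-separated tokens and records the call.
  def parse(text)
    @parse_count += 1
    text.split
  end
end

# Holds the document for one pipeline run and parses it lazily, at most once.
class SharedDocument
  def initialize(text, parser)
    @text = text
    @parser = parser
  end

  def nodes
    @nodes ||= @parser.parse(@text)
  end
end

# Hypothetical reference filter that reads the shared parse instead of re-parsing.
class PrefixReferenceFilter
  def initialize(prefix)
    @prefix = prefix
  end

  def references(shared_document)
    shared_document.nodes.select { |token| token.start_with?(@prefix) }
  end
end

parser = CountingParser.new
document = SharedDocument.new("Fixes #12 and !34 then see #56", parser)
filters = [PrefixReferenceFilter.new("#"), PrefixReferenceFilter.new("!")]
found = filters.map { |filter| filter.references(document) }
```

In this sketch both filters call `document.nodes`, but the parser runs only once, which `parser.parse_count` can confirm. How the shared object would actually be passed between filters in Banzai is left open here.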

Edited Sep 20, 2021 by Sean Carroll