Capture queries during rspec using pg auto_explain (!125454) · Merge requests · GitLab.org / GitLab

Matt Kasa requested to merge 414325-record-auto-explain into master Jul 05, 2023

What does this MR do and why?

Captures each unique sql query with an example query plan during rspec testing, deduplicates them, and creates an artifact containing this data in the new rspec:merge-auto-explain-logs job.

Note: Never runs when pipeline:expedite is present, and currently only runs when the pipeline:record-queries label is present.

This is accomplished by the following procedure:

We enable the auto_explain module in postgres containers that are at least pg14, and configure it to log a plan and query text for every query to postgres's log, using auto_explain's json output format.
At the end of each spec run, we mount the postgres log file as a table, using postgres's file_fdw feature. Note that accessing the postgres log file as a table is an example in the file_fdw documentation!
We find the auto_explain entries by querying this log file for data that looks like json (specifically data that starts with {), since auto_explain is the only system writing json to the log file.
We deduplicate these queries locally in the job by query_id, a cheap hashed representation of the query text that is not portable between rspec jobs or if a table is dropped then recreated.
We calculate a fingerprint for each deduplicated query. This is a hash of the normalized query text that does not depend on the postgres instance running, and is comparable across jobs. We do this in each rspec job because it takes a few minutes if done during the merge process, but is very fast spread across all rspec jobs.
We write an artifact for each database and each job - a gzipped ndjson file with rows that have query, plan, fingerprint, normalized (where normalized is a normalized form of the query, with constants replaced with placeholders).
We merge and deduplicate each of these artifacts, producing a final artifact of ~5.8MB containing roughly 18,000 queries.

This artifact is the first step in enabling a few different advanced analyses of database queries:

We can diff it to detect new queries introduced in an MR, and decide if a database review is necessary based on this change.
We can inject changes to the database schema, and see how query plans react to those changes, for example to identify queries incompatible with partitioning.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

I have evaluated the MR acceptance checklist for this MR.

Related to #414325 (closed)

Edited Jul 11, 2023 by Simon Tomlinson

Capture queries during rspec using pg auto_explain

What does this MR do and why?

MR acceptance checklist

Merge request reports