Use auto_explain to record queries with plans
What does this MR do and why?
Captures each unique sql query with an example query plan during rspec testing, deduplicates them, and creates an artifact containing this data in the new rspec:merge-auto-explain-logs
job.
Note: Never runs when pipeline:expedite is present, and currently only runs when the pipeline:record-queries label is present.
This is accomplished by the following procedure:
- We enable the
auto_explain
module in postgres containers that are at least pg14, and configure it to log a plan and query text for every query to postgres's log, usingauto_explain
's json output format. - At the end of each spec run, we mount the postgres log file as a table, using postgres's
file_fdw
feature. Note that accessing the postgres log file as a table is an example in thefile_fdw
documentation! - We find the
auto_explain
entries by querying this log file for data that looks like json (specifically data that starts with{
), sinceauto_explain
is the only system writing json to the log file. - We deduplicate these queries locally in the job by
query_id
, a cheap hashed representation of the query text that is not portable between rspec jobs. - We calculate a
fingerprint
for each deduplicated query. This is a hash of the normalized query text that does not depend on the postgres instance running, and is comparable across jobs. We do this in each rspec job because it takes a few minutes if done during the merge process, but is very fast spread across all rspec jobs. - We write an artifact for each database and each job - a gzipped ndjson file with rows that have
query, plan, fingerprint, normalized
(where normalized is a normalized form of the query, with constants replaced with placeholders). - We merge and deduplicate each of these artifacts, producing a final artifact of ~5.8MB containing roughly 18,000 queries.
This artifact is the first step in enabling a few different advanced analyses of database queries:
- We can diff it to detect new queries introduced in an MR, and decide if a database review is necessary based on this change.
- We can inject changes to the database schema, and see how query plans react to those changes, for example to identify queries incompatible with partitioning.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Edited by Matt Kasa