Investigate autovacuum performance impacts of CI partitioning
As part of the CI partitioning effort, we plan to partition CI tables by a logical partition identifier, correlated with build creation time, as outlined in https://docs.gitlab.com/ee/architecture/blueprints/ci_data_decay/pipeline_partitioning.html#implementing-a-time-decay-pattern-using-partitioning. Once the data for a partition identifier is old enough, we will mark it as "archived" in a queueing table, and stop writing to any row with that partition identifier.
We hope that this will improve autovacuum performance on CI tables in the following way:
- We mark old partition identifiers as "archived" and stop writing to any rows with them.
- Eventually, an old partition will contain only "archived" rows.
- Autovacuum will eventually freeze all rows in that partition and mark its pages as all-frozen in the visibility map.
- Autovacuum can then skip those pages, so it should never again need to scan or modify that partition.
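One way to check whether an archived partition has reached this state is to compare its all-frozen page count against its total page count. The sketch below is only illustrative: it assumes the `pg_visibility` extension is installed (it provides `pg_visibility_map_summary`), the partition name is hypothetical, and `relpages` is an estimate that is only refreshed by vacuum and analyze, so the ratio is approximate.

```python
# Hypothetical partition name; substitute a real archived partition.
PARTITION = "ci_builds_p100"

# Query to run against the database (for example with a parameterized
# driver call). pg_visibility_map_summary() comes from the pg_visibility
# extension; relpages is the page-count estimate stored in pg_class.
SUMMARY_SQL = """
SELECT c.relpages, s.all_visible, s.all_frozen
FROM pg_class AS c,
     pg_visibility_map_summary(c.oid::regclass) AS s
WHERE c.oid = %s::regclass
"""


def frozen_fraction(relpages: int, all_frozen: int) -> float:
    """Return the approximate share of heap pages marked all-frozen.

    An empty relation is treated as fully frozen because there is
    nothing left for vacuum to scan.
    """
    if relpages <= 0:
        return 1.0
    # relpages can lag behind reality, so clamp the ratio at 1.0.
    return min(all_frozen / relpages, 1.0)
```

A fraction close to 1.0 for an archived partition would suggest that later vacuums can skip almost all of its pages.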
Additionally, we hope that autovacuum performance in partitions that are receiving updates will be improved in two ways:
- Rows that are frequently receiving updates will be more likely to occupy the same disk page, so autovacuum will process fewer pages.
- Autovacuum will be able to process multiple partitions in parallel.
- Estimate the impacts that CI partitioning will have on autovacuum performance
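As a rough starting point for such an estimate, the page-locality argument can be illustrated with a toy model. This is a sketch under strong assumptions, not a model of PostgreSQL itself: rows map to pages purely by position, every updated row dirties only its own page, and the row counts, page size, and function names are all made up for illustration. It ignores HOT updates, new tuple versions landing on other pages, and index cleanup.

```python
import random


def pages_touched(updated_rows, rows_per_page):
    """Count distinct heap pages that contain at least one updated row."""
    return len({row // rows_per_page for row in updated_rows})


def simulate(total_rows=1_000_000, rows_per_page=50, updated=10_000, seed=0):
    """Compare pages needing vacuum work for scattered vs. clustered updates.

    Returns a (scattered_pages, clustered_pages) tuple.
    """
    rng = random.Random(seed)
    # Scattered case: updated rows are spread uniformly over the table.
    scattered = rng.sample(range(total_rows), updated)
    # Clustered case: updated rows are the most recently inserted ones,
    # which sit next to each other at the end of the table.
    clustered = range(total_rows - updated, total_rows)
    return (
        pages_touched(scattered, rows_per_page),
        pages_touched(clustered, rows_per_page),
    )


if __name__ == "__main__":
    scattered_pages, clustered_pages = simulate()
    print(f"scattered updates touch {scattered_pages} pages")
    print(f"clustered updates touch {clustered_pages} pages")
```

With these default parameters the clustered case touches 200 pages (10,000 rows at 50 rows per page), while the scattered case touches far more. Real measurements on production-like data would still be needed to turn this into an actual estimate.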