Add filtering of pipelines by name to PipelinesFinder (!105852) · Merge requests · GitLab.org / GitLab

Reuben Pereira requested to merge rp/pipeline-title-finde-by-name into master Dec 02, 2022

What does this MR do and why?

Describe in detail what your merge request does and why.

Add filtering of pipelines by name to PipelinesFinder.
Add an index on pipeline_id, lower(name) to ci_pipeline_metadata, and remove the existing index.

The ci_pipeline_metadata table was added in 15.6, and the feature flag to use it has not been enabled on gitlab.com yet. So the migrations should be quick.

Performance of query to find pipelines by name:

	Index 1 (`pipeline_id, name`)	Index 2 (`pipeline_id, name text_pattern_ops`)	Index 3 (`pipeline_id, lower(name) text_pattern_ops`)	Notes
Query with single argument (`IN 'qa pipeline'`)	https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/13663/commands/48043	https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/13663/commands/48057	https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/13663/commands/48064	Index 3 seems to be the best for this query since it does a parallel index scan on `ci_pipeline_metadata` without any "Rows Removed by Filter". The other two indices do parallel index-only scans, but they have "Rows removed by filter". If all 3 indices are present, the planner chooses to use index 3.
Query with multiple arguments (`IN ('build pipeline', 'qa pipeline')`)	https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/13663/commands/48042	https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/13663/commands/48060	https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/13663/commands/48072	Index 1 & 2 seem better than index 3 here since index 3 results in a parallel index scan on `ci_pipeline_metadata` while indices 1 & 2 result in a parallel index-only scan, though all 3 indices have "Rows Removed by Filter". If all 3 indices are present, the planner chooses to use index 2.

The pipeline_id, name and pipeline_id, name text_pattern_ops indices perform better with multiple arguments (IN ('build pipeline', 'qa pipeline')), but the pipeline_id, lower(name) text_pattern_ops index performs better with a single argument (IN 'qa pipeline').

Since there is no data in the ci_pipeline_metadata table currently, I added fake data using the following query:

insert into ci_pipeline_metadata (pipeline_id, project_id, name) select id, project_id, ('[0:6]={Ruby pipeline,Deploy GSTG,Deploy GSTG-CNY,Build pipeline,Deploy GPRD,QA pipeline,release-tools pipeline}'::text[])[trunc(random()*7)] from ci_pipelines where project_id = 278964

Migration:

main: == 20221202154128 AddPipelineMetadataNameIndex: migrating =====================
main: -- transaction_open?()
main:    -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.0036s
main: -- index_exists?(:ci_pipeline_metadata, "pipeline_id, lower(name) text_pattern_ops", {:name=>"index_pipeline_metadata_on_pipeline_id_name_lower_text_pattern", :algorithm=>:concurrently})
main:    -> 0.0151s
main: -- Index not created because it already exists (this may be due to an aborted migration or similar): table_name: ci_pipeline_metadata, column_name: pipeline_id, lower(name) text_pattern_ops
main: == 20221202154128 AddPipelineMetadataNameIndex: migrated (0.0292s) ============

main: == 20221202154151 RemovePipelineMetadataPipelineIdIndex: migrating ============
main: -- transaction_open?()
main:    -> 0.0001s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.0016s
main: -- indexes(:ci_pipeline_metadata)
main:    -> 0.0038s
main: -- current_schema()
main:    -> 0.0005s
main: == 20221202154151 RemovePipelineMetadataPipelineIdIndex: migrated (0.0162s) ===

Revert migration

main: == 20221202154151 RemovePipelineMetadataPipelineIdIndex: reverting ============
main: -- transaction_open?()
main:    -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.0083s
main: -- index_exists?(:ci_pipeline_metadata, [:pipeline_id, :name], {:name=>"index_ci_pipeline_metadata_on_pipeline_id_name", :algorithm=>:concurrently})
main:    -> 0.0060s
main: -- add_index(:ci_pipeline_metadata, [:pipeline_id, :name], {:name=>"index_ci_pipeline_metadata_on_pipeline_id_name", :algorithm=>:concurrently})
main:    -> 0.0098s
main: == 20221202154151 RemovePipelineMetadataPipelineIdIndex: reverted (0.0391s) ===

main: == 20221202154128 AddPipelineMetadataNameIndex: reverting =====================
main: -- transaction_open?()
main:    -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.0024s
main: -- indexes(:ci_pipeline_metadata)
main:    -> 0.0059s
main: -- remove_index(:ci_pipeline_metadata, {:algorithm=>:concurrently, :name=>"index_pipeline_metadata_on_pipeline_id_name_lower_text_pattern"})
main:    -> 0.0038s
main: == 20221202154128 AddPipelineMetadataNameIndex: reverted (0.0251s) ============

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

I have evaluated the MR acceptance checklist for this MR.

Edited Dec 06, 2022 by Reuben Pereira

Add filtering of pipelines by name to PipelinesFinder

What does this MR do and why?

Screenshots or screen recordings

How to set up and validate locally

MR acceptance checklist

Merge request reports