Use simpler query for cancellable redundant pipelines
What does this MR do and why?
This MR removes some of the more expensive queries from the CancelRedundantPipelineService.
The only behavior we've changed is removing the 3-day cancellation window. That window is something we implemented in the past to mitigate the query performance, but in the new implementation it's faster if we ignore created_at
and remove the limitation window completely.
Update: While the time window has been removed from the query, we added logic to not cancel pipelines older than three days. For the time being, we'll log but not cancel those Pipelines, in order to understand how many older-than expected Pipelines we would cancel if we rolled it out for the entire table, as the query is optimzied for.
It's an idea we had in Spike: How to improve the long-term scalability... (#438101 - closed)
Query Changes
I used the QueryLogger to record the queries made in the CancelRedundantPipelineService specs, and diffed the logs. Here's a combined log, showing what's changed. I'll include query analysis of the changes below.
- SELECT "ci_pipelines"."id" FROM "ci_pipelines" WHERE "ci_pipelines"."project_id" = 86 AND "ci_pipelines"."created_at" > '2024-05-05 03:47:40.656030' ORDER BY "ci_pipelines"."status" ASC, "ci_pipelines"."created_at" ASC LIMIT 100 OFFSET 0 /*application:test,correlation_id:52c1b83013d524b0cc2f3a215a101a11,db_config_name:ci,line:/app/services/ci/pipeline_creation/cancel_redundant_pipelines_service.rb:52:in `block (2 levels) in paginator'*/
+ SELECT "ci_pipelines"."id" FROM "ci_pipelines" WHERE "ci_pipelines"."project_id" = 84 AND "ci_pipelines"."ref" = 'master' AND "ci_pipelines"."id" != 387 AND ("ci_pipelines"."status" IN ('preparing','pending','running','waiting_for_callback','waiting_for_resource','created','scheduled','manual')) /*application:test,correlation_id:1b1e796be77fc1edb23be7e4d555c849,db_config_name:ci,line:/app/services/ci/pipeline_creation/cancel_redundant_pipelines_service.rb:85:in `cancelable_status_pipeline_ids'*/
- SAVEPOINT active_record_2 /*application:test,correlation_id:52c1b83013d524b0cc2f3a215a101a11,db_config_name:ci,line:/app/services/ci/pipeline_creation/cancel_redundant_pipelines_service.rb:33:in `block (2 levels) in execute'*/
- WITH RECURSIVE "base_and_descendants" AS ((SELECT "ci_pipelines"."id", "ci_pipelines"."ref", "ci_pipelines"."sha", "ci_pipelines"."before_sha", "ci_pipelines"."created_at", "ci_pipelines"."updated_at", "ci_pipelines"."tag", "ci_pipelines"."yaml_errors", "ci_pipelines"."committed_at", "ci_pipelines"."project_id", "ci_pipelines"."status", "ci_pipelines"."started_at", "ci_pipelines"."finished_at", "ci_pipelines"."duration", "ci_pipelines"."user_id", "ci_pipelines"."lock_version", "ci_pipelines"."pipeline_schedule_id", "ci_pipelines"."source", "ci_pipelines"."config_source", "ci_pipelines"."protected", "ci_pipelines"."failure_reason", "ci_pipelines"."iid", "ci_pipelines"."merge_request_id", "ci_pipelines"."source_sha", "ci_pipelines"."target_sha", "ci_pipelines"."external_pull_request_id", "ci_pipelines"."ci_ref_id", "ci_pipelines"."locked", "ci_pipelines"."partition_id", "ci_pipelines"."auto_canceled_by_id" FROM "ci_pipelines" WHERE "ci_pipelines"."project_id" = 86 AND "ci_pipelines"."created_at" > '2024-05-05 03:47:40.658343' AND "ci_pipelines"."ref" = 'master' AND "ci_pipelines"."sha" IS NOT NULL AND (created_at < '2024-05-08 03:47:40.616178') AND "ci_pipelines"."status" IN ('created', 'waiting_for_resource', 'preparing', 'waiting_for_callback', 'pending', 'running', 'success', 'failed', 'canceling', 'canceled', 'skipped', 'manual', 'scheduled') AND ("ci_pipelines"."source" IN (1, 2, 3, 4, 5, 6, 7, 8, 10, 11) OR "ci_pipelines"."source" IS NULL) AND "ci_pipelines"."id" IN (404, 405)) UNION (SELECT "ci_pipelines"."id", "ci_pipelines"."ref", "ci_pipelines"."sha", "ci_pipelines"."before_sha", "ci_pipelines"."created_at", "ci_pipelines"."updated_at", "ci_pipelines"."tag", "ci_pipelines"."yaml_errors", "ci_pipelines"."committed_at", "ci_pipelines"."project_id", "ci_pipelines"."status", "ci_pipelines"."started_at", "ci_pipelines"."finished_at", "ci_pipelines"."duration", "ci_pipelines"."user_id", "ci_pipelines"."lock_version", "ci_pipelines"."pipeline_schedule_id", "ci_pipelines"."source", "ci_pipelines"."config_source", "ci_pipelines"."protected", "ci_pipelines"."failure_reason", "ci_pipelines"."iid", "ci_pipelines"."merge_request_id", "ci_pipelines"."source_sha", "ci_pipelines"."target_sha", "ci_pipelines"."external_pull_request_id", "ci_pipelines"."ci_ref_id", "ci_pipelines"."locked", "ci_pipelines"."partition_id", "ci_pipelines"."auto_canceled_by_id" FROM "ci_pipelines", "base_and_descendants", "ci_sources_pipelines" WHERE "ci_sources_pipelines"."pipeline_id" = "ci_pipelines"."id" AND "ci_sources_pipelines"."source_pipeline_id" = "base_and_descendants"."id" AND "ci_sources_pipelines"."source_project_id" = "ci_sources_pipelines"."project_id")) SELECT id FROM "base_and_descendants" AS "ci_pipelines" WHERE "ci_pipelines"."status" IN ('preparing', 'pending', 'running', 'waiting_for_callback', 'waiting_for_resource', 'created', 'scheduled', 'manual')
SELECT "ci_pipelines"."id", "ci_pipelines"."ref", "ci_pipelines"."sha", "ci_pipelines"."before_sha", "ci_pipelines"."created_at", "ci_pipelines"."updated_at", "ci_pipelines"."tag", "ci_pipelines"."yaml_errors", "ci_pipelines"."committed_at", "ci_pipelines"."project_id", "ci_pipelines"."status", "ci_pipelines"."started_at", "ci_pipelines"."finished_at", "ci_pipelines"."duration", "ci_pipelines"."user_id", "ci_pipelines"."lock_version", "ci_pipelines"."pipeline_schedule_id", "ci_pipelines"."source", "ci_pipelines"."config_source", "ci_pipelines"."protected", "ci_pipelines"."failure_reason", "ci_pipelines"."iid", "ci_pipelines"."merge_request_id", "ci_pipelines"."source_sha", "ci_pipelines"."target_sha", "ci_pipelines"."external_pull_request_id", "ci_pipelines"."ci_ref_id", "ci_pipelines"."locked", "ci_pipelines"."partition_id", "ci_pipelines"."auto_canceled_by_id" FROM "ci_pipelines" WHERE "ci_pipelines"."id" = 404
SELECT "ci_pipeline_metadata".* FROM "ci_pipeline_metadata" WHERE "ci_pipeline_metadata"."pipeline_id" = 404 LIMIT 1
SELECT "ci_pipelines"."id" FROM "ci_pipelines" WHERE "ci_pipelines"."id" = 404 AND (NOT EXISTS (SELECT 1 FROM "p_ci_builds" INNER JOIN "p_ci_builds_metadata" ON "p_ci_builds_metadata"."partition_id" IS NOT NULL AND "p_ci_builds_metadata"."build_id" = "p_ci_builds"."id" AND "p_ci_builds_metadata"."partition_id" = "p_ci_builds"."partition_id" WHERE "p_ci_builds"."type" = 'Ci::Build' AND "p_ci_builds"."commit_id" = "ci_pipelines"."id" AND "p_ci_builds"."partition_id" = "ci_pipelines"."partition_id" AND ("p_ci_builds"."status" IN ('running','success','failed')) AND "p_ci_builds_metadata"."id" NOT IN (SELECT "p_ci_builds_metadata"."id" FROM "p_ci_builds_metadata" WHERE "p_ci_builds_metadata"."build_id" = "p_ci_builds"."id" AND "p_ci_builds_metadata"."partition_id" = "p_ci_builds"."partition_id" AND "p_ci_builds_metadata"."interruptible" = TRUE)))
SELECT 1 AS one FROM "p_ci_builds" WHERE "p_ci_builds"."commit_id" = 404 AND "p_ci_builds"."partition_id" = 102 AND "p_ci_builds"."status" IN ('preparing', 'pending', 'running', 'waiting_for_callback', 'waiting_for_resource', 'created', 'scheduled') LIMIT 1
UPDATE "ci_pipelines" SET "auto_canceled_by_id" = 405 WHERE "ci_pipelines"."id" = 404 AND "ci_pipelines"."lock_version" = 0
+ SAVEPOINT active_record_2 /*application:test,correlation_id:1b1e796be77fc1edb23be7e4d555c849,db_config_name:ci,line:/app/services/ci/cancel_pipeline_service.rb:90:in `block in cancel_jobs'*/
SELECT "p_ci_builds"."status", "p_ci_builds"."finished_at", "p_ci_builds"."created_at", "p_ci_builds"."updated_at", "p_ci_builds"."started_at", "p_ci_builds"."coverage", "p_ci_builds"."name", "p_ci_builds"."options", "p_ci_builds"."allow_failure", "p_ci_builds"."stage", "p_ci_builds"."trigger_request_id", "p_ci_builds"."stage_idx", "p_ci_builds"."tag", "p_ci_builds"."ref", "p_ci_builds"."type", "p_ci_builds"."target_url", "p_ci_builds"."description", "p_ci_builds"."erased_by_id", "p_ci_builds"."erased_at", "p_ci_builds"."artifacts_expire_at", "p_ci_builds"."environment", "p_ci_builds"."when", "p_ci_builds"."yaml_variables", "p_ci_builds"."queued_at", "p_ci_builds"."lock_version", "p_ci_builds"."coverage_regex", "p_ci_builds"."retried", "p_ci_builds"."protected", "p_ci_builds"."failure_reason", "p_ci_builds"."scheduled_at", "p_ci_builds"."token_encrypted", "p_ci_builds"."resource_group_id", "p_ci_builds"."waiting_for_resource_at", "p_ci_builds"."processed", "p_ci_builds"."scheduling_type", "p_ci_builds"."id", "p_ci_builds"."stage_id", "p_ci_builds"."partition_id", "p_ci_builds"."auto_canceled_by_partition_id", "p_ci_builds"."auto_canceled_by_id", "p_ci_builds"."commit_id", "p_ci_builds"."project_id", "p_ci_builds"."runner_id", "p_ci_builds"."upstream_pipeline_id", "p_ci_builds"."user_id" FROM "p_ci_builds" WHERE "p_ci_builds"."commit_id" = 404 AND "p_ci_builds"."partition_id" = 102 AND "p_ci_builds"."status" IN ('preparing', 'pending', 'running', 'waiting_for_callback', 'waiting_for_resource', 'created', 'scheduled') ORDER BY "p_ci_builds"."id" ASC LIMIT 1000
SELECT "p_ci_builds"."status", "p_ci_builds"."finished_at", "p_ci_builds"."created_at", "p_ci_builds"."updated_at", "p_ci_builds"."started_at", "p_ci_builds"."coverage", "p_ci_builds"."name", "p_ci_builds"."options", "p_ci_builds"."allow_failure", "p_ci_builds"."stage", "p_ci_builds"."trigger_request_id", "p_ci_builds"."stage_idx", "p_ci_builds"."tag", "p_ci_builds"."ref", "p_ci_builds"."type", "p_ci_builds"."target_url", "p_ci_builds"."description", "p_ci_builds"."erased_by_id", "p_ci_builds"."erased_at", "p_ci_builds"."artifacts_expire_at", "p_ci_builds"."environment", "p_ci_builds"."when", "p_ci_builds"."yaml_variables", "p_ci_builds"."queued_at", "p_ci_builds"."lock_version", "p_ci_builds"."coverage_regex", "p_ci_builds"."retried", "p_ci_builds"."protected", "p_ci_builds"."failure_reason", "p_ci_builds"."scheduled_at", "p_ci_builds"."token_encrypted", "p_ci_builds"."resource_group_id", "p_ci_builds"."waiting_for_resource_at", "p_ci_builds"."processed", "p_ci_builds"."scheduling_type", "p_ci_builds"."id", "p_ci_builds"."stage_id", "p_ci_builds"."partition_id", "p_ci_builds"."auto_canceled_by_partition_id", "p_ci_builds"."auto_canceled_by_id", "p_ci_builds"."commit_id", "p_ci_builds"."project_id", "p_ci_builds"."runner_id", "p_ci_builds"."upstream_pipeline_id", "p_ci_builds"."user_id" FROM "p_ci_builds" WHERE "p_ci_builds"."id" IN (915, 917)
SELECT "projects"."id", "projects"."name", "projects"."path", "projects"."description", "projects"."created_at", "projects"."updated_at", "projects"."creator_id", "projects"."namespace_id", "projects"."last_activity_at", "projects"."import_url", "projects"."visibility_level", "projects"."archived", "projects"."avatar", "projects"."merge_requests_template", "projects"."star_count", "projects"."merge_requests_rebase_enabled", "projects"."import_type", "projects"."import_source", "projects"."approvals_before_merge", "projects"."reset_approvals_on_push", "projects"."merge_requests_ff_only_enabled", "projects"."issues_template", "projects"."mirror", "projects"."mirror_last_update_at", "projects"."mirror_last_successful_update_at", "projects"."mirror_user_id", "projects"."shared_runners_enabled", "projects"."runners_token", "projects"."build_allow_git_fetch", "projects"."build_timeout", "projects"."mirror_trigger_builds", "projects"."pending_delete", "projects"."public_builds", "projects"."last_repository_check_failed", "projects"."last_repository_check_at", "projects"."only_allow_merge_if_pipeline_succeeds", "projects"."has_external_issue_tracker", "projects"."repository_storage", "projects"."repository_read_only", "projects"."request_access_enabled", "projects"."has_external_wiki", "projects"."ci_config_path", "projects"."lfs_enabled", "projects"."description_html", "projects"."only_allow_merge_if_all_discussions_are_resolved", "projects"."repository_size_limit", "projects"."printing_merge_request_link_enabled", "projects"."auto_cancel_pending_pipelines", "projects"."service_desk_enabled", "projects"."cached_markdown_version", "projects"."delete_error", "projects"."last_repository_updated_at", "projects"."disable_overriding_approvers_per_merge_request", "projects"."storage_version", "projects"."resolve_outdated_diff_discussions", "projects"."remote_mirror_available_overridden", "projects"."only_mirror_protected_branches", "projects"."pull_mirror_available_overridden", "projects"."jobs_cache_index", "projects"."external_authorization_classification_label", "projects"."mirror_overwrites_diverged_branches", "projects"."pages_https_only", "projects"."external_webhook_token", "projects"."packages_enabled", "projects"."merge_requests_author_approval", "projects"."pool_repository_id", "projects"."runners_token_encrypted", "projects"."bfg_object_map", "projects"."detected_repository_languages", "projects"."merge_requests_disable_committers_approval", "projects"."require_password_to_approve", "projects"."max_pages_size", "projects"."max_artifacts_size", "projects"."pull_mirror_branch_prefix", "projects"."remove_source_branch_after_merge", "projects"."marked_for_deletion_at", "projects"."marked_for_deletion_by_user_id", "projects"."autoclose_referenced_issues", "projects"."suggestion_commit_message", "projects"."project_namespace_id", "projects"."hidden", "projects"."organization_id" FROM "projects" WHERE "projects"."id" = 86
SELECT "ci_pipelines"."id", "ci_pipelines"."ref", "ci_pipelines"."sha", "ci_pipelines"."before_sha", "ci_pipelines"."created_at", "ci_pipelines"."updated_at", "ci_pipelines"."tag", "ci_pipelines"."yaml_errors", "ci_pipelines"."committed_at", "ci_pipelines"."project_id", "ci_pipelines"."status", "ci_pipelines"."started_at", "ci_pipelines"."finished_at", "ci_pipelines"."duration", "ci_pipelines"."user_id", "ci_pipelines"."lock_version", "ci_pipelines"."pipeline_schedule_id", "ci_pipelines"."source", "ci_pipelines"."config_source", "ci_pipelines"."protected", "ci_pipelines"."failure_reason", "ci_pipelines"."iid", "ci_pipelines"."merge_request_id", "ci_pipelines"."source_sha", "ci_pipelines"."target_sha", "ci_pipelines"."external_pull_request_id", "ci_pipelines"."ci_ref_id", "ci_pipelines"."locked", "ci_pipelines"."partition_id", "ci_pipelines"."auto_canceled_by_id" FROM "ci_pipelines" WHERE "ci_pipelines"."partition_id" = 102 AND "ci_pipelines"."id" = 404
SELECT "deployments".* FROM "deployments" WHERE "deployments"."deployable_type" = 'CommitStatus' AND "deployments"."deployable_id" IN (915, 917)
SELECT "taggings".* FROM "taggings" WHERE "taggings"."taggable_type" = 'CommitStatus' AND "taggings"."taggable_id" IN (915, 917) /*application:test,correlation_id:52c1b83013d524b0cc2f3a215a101a11,db_config_name:ci,line:/app/models/preloaders/commit_status_preloader.rb:16:in `block in execute'*/
SELECT "p_ci_builds_metadata".* FROM "p_ci_builds_metadata" WHERE "p_ci_builds_metadata"."build_id" = 915 AND "p_ci_builds_metadata"."partition_id" = 102 LIMIT 1
UPDATE "p_ci_builds" SET "status" = 'canceled', "finished_at" = '2024-05-08 03:47:40.699804', "updated_at" = '2024-05-08 03:47:40.700441', "processed" = FALSE, "auto_canceled_by_partition_id" = 102, "auto_canceled_by_id" = 405, "lock_version" = 1 WHERE "p_ci_builds"."id" = 915 AND "p_ci_builds"."partition_id" = 102 AND "p_ci_builds"."lock_version" = 0
DELETE FROM "ci_builds_runner_session" WHERE "ci_builds_runner_session"."build_id" = 915
UPDATE "p_ci_builds" SET "status" = 'canceled', "finished_at" = '2024-05-08 03:47:40.705090', "updated_at" = '2024-05-08 03:47:40.705501', "processed" = FALSE, "auto_canceled_by_partition_id" = 102, "auto_canceled_by_id" = 405, "lock_version" = 1 WHERE "p_ci_builds"."id" = 917 AND "p_ci_builds"."partition_id" = 102 AND "p_ci_builds"."lock_version" = 0
RELEASE SAVEPOINT active_record_2
SELECT "namespaces"."id", "namespaces"."name", "namespaces"."path", "namespaces"."owner_id", "namespaces"."created_at", "namespaces"."updated_at", "namespaces"."type", "namespaces"."description", "namespaces"."avatar", "namespaces"."membership_lock", "namespaces"."share_with_group_lock", "namespaces"."visibility_level", "namespaces"."request_access_enabled", "namespaces"."ldap_sync_status", "namespaces"."ldap_sync_error", "namespaces"."ldap_sync_last_update_at", "namespaces"."ldap_sync_last_successful_update_at", "namespaces"."ldap_sync_last_sync_at", "namespaces"."description_html", "namespaces"."lfs_enabled", "namespaces"."parent_id", "namespaces"."shared_runners_minutes_limit", "namespaces"."repository_size_limit", "namespaces"."require_two_factor_authentication", "namespaces"."two_factor_grace_period", "namespaces"."cached_markdown_version", "namespaces"."project_creation_level", "namespaces"."runners_token", "namespaces"."file_template_project_id", "namespaces"."saml_discovery_token", "namespaces"."runners_token_encrypted", "namespaces"."custom_project_templates_group_id", "namespaces"."auto_devops_enabled", "namespaces"."extra_shared_runners_minutes_limit", "namespaces"."last_ci_minutes_notification_at", "namespaces"."last_ci_minutes_usage_notification_level", "namespaces"."subgroup_creation_level", "namespaces"."max_pages_size", "namespaces"."max_artifacts_size", "namespaces"."mentions_disabled", "namespaces"."default_branch_protection", "namespaces"."max_personal_access_token_lifetime", "namespaces"."push_rule_id", "namespaces"."shared_runners_enabled", "namespaces"."allow_descendants_override_disabled_shared_runners", "namespaces"."traversal_ids", "namespaces"."organization_id" FROM "namespaces" WHERE "namespaces"."id" = 171 LIMIT 1
SELECT "routes".* FROM "routes" WHERE "routes"."source_id" = 171 AND "routes"."source_type" = 'Namespace' LIMIT 1
- SELECT "ci_pipelines"."id" FROM "ci_pipelines" WHERE "ci_pipelines"."project_id" = 86 AND "ci_pipelines"."created_at" > '2024-05-05 03:47:40.717762' ORDER BY "ci_pipelines"."status" ASC, "ci_pipelines"."created_at" ASC LIMIT 100 OFFSET 100
Fetching initial batches of Pipelines to iterate over
For testing, I've used the master
branch of gitlab-org/gitlab
, as an example of a large, active project.
Removed
SELECT "ci_pipelines"."id" FROM "ci_pipelines" WHERE "ci_pipelines"."project_id" = 86 AND "ci_pipelines"."created_at" > '2024-05-05 03:47:40.656030' ORDER BY "ci_pipelines"."status" ASC, "ci_pipelines"."created_at" ASC LIMIT 100 OFFSET 0`
- Plan: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/28171/commands/87759
- Usage: This query was used to paginate over all pipelines on the given ref created in the last three days. So, (3 days worth of CI pipelines / 100) executions.
Added
SELECT "ci_pipelines"."id" FROM "ci_pipelines" WHERE "ci_pipelines"."project_id" = 278964 AND "ci_pipelines"."ref" = 'master' AND "ci_pipelines"."id" != 1312535944 AND ("ci_pipelines"."status" IN ('preparing','pending','running','waiting_for_callback','waiting_for_resource','created','scheduled','manual')) ORDER BY "ci_pipelines"."id" DESC LIMIT 3000;
- Plan: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/28695/commands/89411
- Usage: This query will execute exactly one time and return the ID of all pipelines running on the ref (up to 3000 IDs), excluding the ID of the Pipeline that triggered the service.
Finding cancelable pipelines
Removed
WITH RECURSIVE "base_and_descendants" AS ((SELECT "ci_pipelines"."id", "ci_pipelines"."ref", "ci_pipelines"."sha", "ci_pipelines"."before_sha", "ci_pipelines"."created_at", "ci_pipelines"."updated_at", "ci_pipelines"."tag", "ci_pipelines"."yaml_errors", "ci_pipelines"."committed_at", "ci_pipelines"."project_id", "ci_pipelines"."status", "ci_pipelines"."started_at", "ci_pipelines"."finished_at", "ci_pipelines"."duration", "ci_pipelines"."user_id", "ci_pipelines"."lock_version", "ci_pipelines"."pipeline_schedule_id", "ci_pipelines"."source", "ci_pipelines"."config_source", "ci_pipelines"."protected", "ci_pipelines"."failure_reason", "ci_pipelines"."iid", "ci_pipelines"."merge_request_id", "ci_pipelines"."source_sha", "ci_pipelines"."target_sha", "ci_pipelines"."external_pull_request_id", "ci_pipelines"."ci_ref_id", "ci_pipelines"."locked", "ci_pipelines"."partition_id", "ci_pipelines"."auto_canceled_by_id" FROM "ci_pipelines" WHERE "ci_pipelines"."project_id" = 86 AND "ci_pipelines"."created_at" > '2024-05-05 03:47:40.658343' AND "ci_pipelines"."ref" = 'master' AND "ci_pipelines"."sha" IS NOT NULL AND (created_at < '2024-05-08 03:47:40.616178') AND "ci_pipelines"."status" IN ('created', 'waiting_for_resource', 'preparing', 'waiting_for_callback', 'pending', 'running', 'success', 'failed', 'canceling', 'canceled', 'skipped', 'manual', 'scheduled') AND ("ci_pipelines"."source" IN (1, 2, 3, 4, 5, 6, 7, 8, 10, 11) OR "ci_pipelines"."source" IS NULL) AND "ci_pipelines"."id" IN (404, 405)) UNION (SELECT "ci_pipelines"."id", "ci_pipelines"."ref", "ci_pipelines"."sha", "ci_pipelines"."before_sha", "ci_pipelines"."created_at", "ci_pipelines"."updated_at", "ci_pipelines"."tag", "ci_pipelines"."yaml_errors", "ci_pipelines"."committed_at", "ci_pipelines"."project_id", "ci_pipelines"."status", "ci_pipelines"."started_at", "ci_pipelines"."finished_at", "ci_pipelines"."duration", "ci_pipelines"."user_id", "ci_pipelines"."lock_version", "ci_pipelines"."pipeline_schedule_id", "ci_pipelines"."source", "ci_pipelines"."config_source", "ci_pipelines"."protected", "ci_pipelines"."failure_reason", "ci_pipelines"."iid", "ci_pipelines"."merge_request_id", "ci_pipelines"."source_sha", "ci_pipelines"."target_sha", "ci_pipelines"."external_pull_request_id", "ci_pipelines"."ci_ref_id", "ci_pipelines"."locked", "ci_pipelines"."partition_id", "ci_pipelines"."auto_canceled_by_id" FROM "ci_pipelines", "base_and_descendants", "ci_sources_pipelines" WHERE "ci_sources_pipelines"."pipeline_id" = "ci_pipelines"."id" AND "ci_sources_pipelines"."source_pipeline_id" = "base_and_descendants"."id" AND "ci_sources_pipelines"."source_project_id" = "ci_sources_pipelines"."project_id")) SELECT id FROM "base_and_descendants" AS "ci_pipelines" WHERE "ci_pipelines"."status" IN ('preparing', 'pending', 'running', 'waiting_for_callback', 'waiting_for_resource', 'created', 'scheduled', 'manual')
- Plan: https://postgres.ai/console/gitlab/gitlab-production-ci/sessions/28171/commands/87765
- Usage: This query was used for each page in the pagination over all pipelines on the given ref created in the last three days. So, (3 days worth of CI pipelines / 100) executions. As we're no longer paginating and needing to find running children through the parent, we no longer use this query at all.
Added
SELECT "ci_pipelines"."id", "ci_pipelines"."ref", "ci_pipelines"."sha", "ci_pipelines"."before_sha", "ci_pipelines"."created_at", "ci_pipelines"."updated_at", "ci_pipelines"."tag", "ci_pipelines"."yaml_errors", "ci_pipelines"."committed_at", "ci_pipelines"."project_id", "ci_pipelines"."status", "ci_pipelines"."started_at", "ci_pipelines"."finished_at", "ci_pipelines"."duration", "ci_pipelines"."user_id", "ci_pipelines"."lock_version", "ci_pipelines"."pipeline_schedule_id", "ci_pipelines"."source", "ci_pipelines"."config_source", "ci_pipelines"."protected", "ci_pipelines"."failure_reason", "ci_pipelines"."iid", "ci_pipelines"."merge_request_id", "ci_pipelines"."source_sha", "ci_pipelines"."target_sha", "ci_pipelines"."external_pull_request_id", "ci_pipelines"."ci_ref_id", "ci_pipelines"."locked", "ci_pipelines"."partition_id", "ci_pipelines"."auto_canceled_by_id" FROM "ci_pipelines" WHERE "ci_pipelines"."id" IN (1312191877, 1312177136, 1312176990, 1312168544, 1312166809, 1312166008, 1312158544, 1312151206, 1312148554, 1312147017, 1312134166, 1312123450, 1312121173, 1312114875, 1312113062, 1312100083, 1311963067, 1257731175, 1257727855, 1185944549, 1180374149, 1180374070, 1168913923, 1163267428, 1163267381, 1119654103, 1119654017, 1000137498, 977653729, 971776449, 967815490, 898704652, 898524322, 808531280, 803202597, 738818411, 738804134, 738799167, 738794914, 684368196, 681521571, 588893050, 505137469, 330858161, 330855099, 330847165, 254748890, 158185640, 157092882, 152852912, 152127562, 152111663, 152109004, 152100496, 152084287, 152070853, 152007162, 151988529, 151946973, 151923838, 151882784, 151862627, 151862009, 151840166, 151834980, 151776260, 151745586, 151724377, 151694611, 151609479, 151597754, 151564214, 151513449, 150808983, 148772320, 147215396, 146190608, 146156627, 146112670, 146109719, 146066240, 146049811, 146045968, 146037257, 146035206, 146019949, 146011462, 146007174, 145945416, 145944568, 145932822, 145929965, 145924499, 145924213, 145922299, 145919990, 145906646, 145903984, 145834722, 145802579, 145739283, 145730728, 145681542, 145666808, 145641381, 145638168, 145633878, 145605914, 145603316, 145590726, 145587174, 145569904, 145550653, 145547096, 145545469, 145537062, 145532205, 145518682, 145494479, 145469428, 145455636, 145451881, 145451425, 145434086, 145403143, 145311152, 145310859, 145271706, 145260476, 145171568, 145163612, 145161078, 145156069, 145154942, 145154609, 145122630, 145103331, 145080005, 145077689, 145073203, 145071496, 145071332, 145069439, 145067693, 145063919, 143890662, 141442026, 141436730, 141433771, 141410715, 141402588, 141386724, 141329474, 141322578, 141247813, 141247288, 141229244, 141063149, 141058114, 141051118, 141016316, 141004021, 140991446, 140985622, 140984925, 140973596, 140944779, 140899945, 140896576, 140890604, 140870576, 140835333, 140834669, 140782107, 140750356, 140723621, 140665973, 140642011, 140624957, 140589942, 140584876, 140572444, 140560731, 140545254, 140446519, 140433764, 140332524, 140297066, 140290066, 140178833, 140169388, 140166780, 139578404, 139576297, 139543491, 138128996, 136644656, 136608996, 136222586, 135923423, 133158729, 129622461, 126020103, 124802599, 124704091, 123085031, 122693489, 115294004, 114993538, 113154633, 112100244, 110022225, 102337176, 102322606, 102313419, 102312584, 102308139, 102301776, 102299282, 99557994, 97811434, 97462700, 97329755, 97322432, 97313862, 97263420, 97262329, 97251844, 96632803, 96613865, 96350500, 96196405, 89007259, 85472386, 27984779, 16338054, 16093646, 10669192);
- Plan: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/28695/commands/89412
- Usage: We iterate through the IDs in batches of 1,000 and instantate actual Pipeline records, to check therit attributes and determine if we can cancel them. If this query isn't quite as fast as we expect, we could reduce the batch size or refactor the query. But we don't see any obvious issues so far.
Savepoint moved
SAVEPOINT active_record_2
- Usage: The same, just a few more queries happening earlier.
Get-next-page query removed, since we're no longer paginating:
SELECT "ci_pipelines"."id" FROM "ci_pipelines" WHERE "ci_pipelines"."project_id" = 86 AND "ci_pipelines"."created_at" > '2024-05-05 03:47:40.717762' ORDER BY "ci_pipelines"."status" ASC, "ci_pipelines"."created_at" ASC LIMIT 100 OFFSET 100
- Plan: Same as the first, but with a bigger/growing offset.
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Post-deployment verification
I've created a Kibana dashboard tracking total DB time, number of pipelines cancelled, and maximum, average, and median query counts for the Ci::CancelRedundantPipelinesWorker
.
When we enable the flag, the biggest changes we should see will be drops in total DB time and maximum number of queries fired by a single execution of the worker. Average and median number of queries, along with average DB time, might also drop a little bit. We do not expect anything to notably increase. The number of Pipelines cancelled should not show any change.
There are separate panels for the gitlab-org/gitlab
project, and then SaaS as a whole. We'll enable the new query first for SaaS, and let some self-managed customers know they can try it out. If everything goes well, we'll enable it by default in %17.1.
Rollout issue: #463623 (closed)
Related to #438101 (closed)