Skip to content

Backfill values for cluster_agents.has_vulnerabilities

What does this MR do and why?

In !90187 (merged) we have added new column cluster_agents.has_vulnerabilities, in this MR we are adding background migration to backfill this value for older records.

Queries

explain UPDATE
  "cluster_agents"
SET
  "has_vulnerabilities" = TRUE
WHERE
  "cluster_agents"."id" IN (
    SELECT
      "cluster_agents"."id"
    FROM
      "cluster_agents"
      INNER JOIN vulnerability_reads ON vulnerability_reads.casted_cluster_agent_id = cluster_agents.id
      AND vulnerability_reads.project_id = cluster_agents.project_id
      AND vulnerability_reads.report_type = 7
    WHERE
      "cluster_agents"."id" BETWEEN 7000 AND 8000
      AND "cluster_agents"."id" >= 7900
      AND "cluster_agents"."id" < 8000
  )
Time: 83.680 ms
  - planning: 43.368 ms
  - execution: 40.312 ms
    - I/O read: 39.515 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 89 (~712.00 KiB) from the buffer pool
  - reads: 9 (~72.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/11136/commands/39858

As cluster_agents table is quite small (~8100 rows), we are not modifying defaults for background migration (single batch 1000, sub batch 100, delay interval 2.minutes):

We have UPDATE query that takes < 500ms for 100 rows, so single batch will take 10 * 500ms => 5000ms => 5.seconds.

We will have ~9 batches, so the migration will take around: 9 * 2.minutes = 18.minutes.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to GraphQL filter for vulnerabilities by container... (#358763 - closed)

Edited by Alan (Maciej) Paruszewski

Merge request reports

Loading