Skip to content

Fix project_settings#has_vulnerabilities data inconsistency issue

What does this MR do and why?

When a project has vulnerabilities, we mark the associated project_settings#has_vulnerabilities column as true to efficiently filter projects with vulnerabilities. Due to a bug that has already been fixed, there are some projects with vulnerabilities that are not marked as has_vulnerabilities.

This MR introduces a new batched background migration to fix that data consistency issue.

Related to Some projects with vulnerabilities does not hav... (#493872 - closed).

Database review

There are currently only 300 records in the entire database in inconsistent state so in total this migration will update only 300 records but this number can be different for self-hosted installations.

The estimation of the db testing job is off because it calculates the estimation based on the total number of tuples in the vulnerability_reads table but we are running a loose-index scan to iterate over unique project IDs and there are only 183_774 unique project IDs in that table. Based on this, the estimation would be (183_774 / 1000) * 2 / 60 ~= 6 hours.

Here is the update query and its execution plan
UPDATE
    "project_settings"
SET
    "has_vulnerabilities" = TRUE
WHERE
    "project_settings"."project_id" IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100)
    AND (has_vulnerabilities IS NOT TRUE)

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/32285/commands/99726

Edited by Mehmet Emin INAC

Merge request reports

Loading