Skip to content

Drop invalid records from vulnerability_feedback table

Michał Zając requested to merge 324893-drop-invalid-feedbacks into master

What does this MR do and why?

Related to #372251 (closed)

Database review

Total count: 380 282

SELECT count(*) FROM vulnerability_feedback;

Rows without matching uuid

Count: 106724

SELECT COUNT(v_f.finding_uuid)
FROM vulnerability_feedback v_f
LEFT JOIN vulnerability_occurrences v_o
ON v_f.finding_uuid = v_o.uuid::uuid
WHERE v_o.uuid IS NULL
AND v_f.finding_uuid IS NOT NULL;

Rows without matching project_fingerprint

Count: 7638

SELECT COUNT(*)
FROM vulnerability_feedback v_f
LEFT JOIN vulnerability_occurrences v_o
ON v_f.project_fingerprint::bytea = v_o.project_fingerprint
WHERE v_f.finding_uuid IS NULL
AND v_o.id IS NULL;

Query plan

TODO: This is really slow, either add index, rewrite this or make two separate migrations

Batch selection query
SELECT
  "vulnerability_feedback".*
FROM
  (
    (
      SELECT
        "vulnerability_feedback".*
      FROM
        "vulnerability_feedback"
        LEFT JOIN vulnerability_occurrences ON vulnerability_feedback.finding_uuid = vulnerability_occurrences.uuid :: uuid
      WHERE
        (
          vulnerability_feedback.finding_uuid IS NOT NULL
          AND vulnerability_occurrences.id IS NULL
        )
    )
    UNION
      (
        SELECT
          "vulnerability_feedback".*
        FROM
          "vulnerability_feedback"
          LEFT JOIN vulnerability_occurrences ON vulnerability_feedback.project_fingerprint :: bytea = vulnerability_occurrences.project_fingerprint
        WHERE
          (
            vulnerability_feedback.finding_uuid IS NULL
            AND vulnerability_occurrences.id IS NULL
          )
      )
  ) vulnerability_feedback
WHERE
  "vulnerability_feedback"."id" BETWEEN 1
  AND 100
  AND "vulnerability_feedback"."id" >= 1

Cold cache:

Warm cache:

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Michał Zając

Merge request reports

Loading