Populate missing dismissal information for vulnerabilities
What does this MR do?
This MR populates missing dismissal information for the vulnerability records with an inline data migration.
Related to #287803 (closed).
Database review
There are only 510 records to be migrated with this migration, this is why an inline migration is chosen.
The lookup queries take less than a second on production replica with this script.
Vulnerability.include(EachBatch)
start = Gitlab::Metrics::System.monotonic_time
Vulnerability.dismissed.where('dismissed_at IS NULL OR dismissed_by_id IS NULL').each_batch(of: 100) do |batch|
batch.pluck(:id)
end
diff = Gitlab::Metrics::System.monotonic_time - start
puts diff # => 0.8272816969547421
Here are the full set of queries
D, [2021-03-25T10:55:37.053021 #27837] DEBUG -- : Vulnerability Load (2.3ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) ORDER BY "vulnerabilities"."id" ASC LIMIT 1
D, [2021-03-25T10:55:37.080241 #27837] DEBUG -- : Vulnerability Load (2.3ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 1406747 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.083941 #27837] DEBUG -- : (2.4ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 1406747 AND "vulnerabilities"."id" < 2135692
D, [2021-03-25T10:55:37.088253 #27837] DEBUG -- : Vulnerability Load (3.3ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2135692 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.090572 #27837] DEBUG -- : (1.3ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2135692 AND "vulnerabilities"."id" < 2847317
D, [2021-03-25T10:55:37.092652 #27837] DEBUG -- : Vulnerability Load (1.2ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2847317 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.095150 #27837] DEBUG -- : (1.7ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2847317 AND "vulnerabilities"."id" < 2848959
D, [2021-03-25T10:55:37.096652 #27837] DEBUG -- : Vulnerability Load (0.6ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2848959 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.098329 #27837] DEBUG -- : (0.9ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2848959 AND "vulnerabilities"."id" < 2851660
D, [2021-03-25T10:55:37.100273 #27837] DEBUG -- : Vulnerability Load (1.2ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2851660 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.101651 #27837] DEBUG -- : (0.7ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2851660 AND "vulnerabilities"."id" < 3395455
D, [2021-03-25T10:55:37.104311 #27837] DEBUG -- : Vulnerability Load (1.9ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 3395455 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.105703 #27837] DEBUG -- : (0.7ms) /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 3395455
Query plans
All the execution plans are with a cold cache.
The query to load the first ID of the vulnerabilities
SELECT
"vulnerabilities"."id"
FROM
"vulnerabilities"
WHERE
"vulnerabilities"."state" = 2
AND (dismissed_at IS NULL
OR dismissed_by_id IS NULL)
ORDER BY
"vulnerabilities"."id" ASC
LIMIT 1
Limit (cost=0.27..0.32 rows=1 width=8) (actual time=6.656..6.657 rows=1 loops=1)
Buffers: shared read=4 dirtied=1
I/O Timings: read=5.674
-> Index Only Scan using temporary_index_vulnerabilities_on_id on public.vulnerabilities (cost=0.27..18467.49 rows=379891 width=8) (actual time=6.653..6.654 rows=1 loops=1)
Heap Fetches: 1
Buffers: shared read=4 dirtied=1
I/O Timings: read=5.674
The query to load the IDs of vulnerability with pluck
SELECT
"vulnerabilities"."id"
FROM
"vulnerabilities"
WHERE
"vulnerabilities"."state" = 2
AND (dismissed_at IS NULL
OR dismissed_by_id IS NULL)
AND "vulnerabilities"."id" >= 1406747
ORDER BY
"vulnerabilities"."id" ASC
LIMIT 1 OFFSET 100
Limit (cost=5.88..5.94 rows=1 width=8) (actual time=16.725..16.727 rows=1 loops=1)
Buffers: shared hit=17 read=15 dirtied=7
I/O Timings: read=15.460
-> Index Only Scan using temporary_index_vulnerabilities_on_id on public.vulnerabilities (cost=0.27..17370.27 rows=309802 width=8) (actual time=0.099..16.710 rows=101 loops=1)
Index Cond: (vulnerabilities.id >= 1406747)
Heap Fetches: 15
Buffers: shared hit=17 read=15 dirtied=7
I/O Timings: read=15.460
The query to update the records with vulnerability IDs
UPDATE
vulnerabilities
SET
dismissed_at = updated_at,
dismissed_by_id = COALESCE(updated_by_id, last_edited_by_id, author_id)
WHERE
vulnerabilities.id IN (...) --- Used exactly 100 IDs returned by the previous queries but removed here to make it easier to read
ModifyTable on public.vulnerabilities (cost=0.43..321.46 rows=100 width=296) (actual time=460.749..460.750 rows=0 loops=1)
Buffers: shared hit=6436 read=371 dirtied=216
I/O Timings: read=441.564
-> Index Scan using vulnerabilities_pkey on public.vulnerabilities (cost=0.43..321.46 rows=100 width=296) (actual time=12.641..119.134 rows=100 loops=1)
Index Cond: (vulnerabilities.id = ANY ('{...}'::bigint[]))
Buffers: shared hit=302 read=98
I/O Timings: read=117.660
Does this MR meet the acceptance criteria?
Conformity
-
📋 Does this MR need a changelog?-
I have included a changelog entry. - [-] I have not included a changelog entry because _____.
-
- [-] Documentation (if required)
-
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides - [-] Separation of EE specific content
Availability and Testing
- [-] Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Mehmet Emin INAC