Delete security_finding records with missing UUIDs
What does this MR do?
This MR introduces a database migration to delete the records without UUIDs from the security_findings
table.
Related to #323577 (closed).
Database review
This migration will delete 104_302 records from the security_findings
table. I've tested the migration without the delete statement on production replica and it took less than a second to finish. Therefore, I've implemented inline migration instead of a background migration.
The script ran on prod replica;
class SecurityFinding < ActiveRecord::Base
include EachBatch
self.table_name = 'security_findings'
scope :without_uuid, -> { where(uuid: nil) }
end
start_time = Gitlab::Metrics::System.monotonic_time
SecurityFinding.without_uuid.each_batch(of: 10_000) do |relation, index|
relation.pluck(:id)
end
time_diff = Gitlab::Metrics::System.monotonic_time - start_time
puts time_diff # 0.715035862987861
There are 3 types of SQL queries used by this migration;
Returns the ID of the first security finding record without UUID;
SELECT
"security_findings"."id"
FROM
"security_findings"
WHERE
"security_findings"."uuid" IS NULL
ORDER BY
"security_findings"."id" ASC
LIMIT 1
Limit (cost=0.57..49.38 rows=1 width=8) (actual time=0.063..0.064 rows=1 loops=1)
Buffers: shared hit=5
-> Index Scan using security_findings_pkey on public.security_findings (cost=0.57..5137091.44 rows=105252 width=8) (actual time=0.062..0.062 rows=1 loops=1)
Filter: (security_findings.uuid IS NULL)
Rows Removed by Filter: 0
Buffers: shared hit=5
Returns the ID of the 10_000th security finding without UUID;
SELECT
"security_findings"."id"
FROM
"security_findings"
WHERE
"security_findings"."uuid" IS NULL
AND "security_findings"."id" >= 989
ORDER BY
"security_findings"."id" ASC
LIMIT 1 OFFSET 10000
Limit (cost=106593.74..106593.85 rows=1 width=8) (actual time=102.333..109.905 rows=1 loops=1)
Buffers: shared hit=9001 dirtied=1
-> Gather Merge (cost=105426.99..115660.54 rows=87710 width=8) (actual time=84.394..108.730 rows=10001 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=9001 dirtied=1
-> Sort (cost=104426.97..104536.60 rows=43855 width=8) (actual time=70.075..71.922 rows=4249 loops=3)
Sort Key: security_findings.id
Sort Method: top-N heapsort Memory: 1301kB
Buffers: shared hit=9001 dirtied=1
-> Parallel Index Scan using tmp_index_on_security_findings_scan_id on public.security_findings (cost=0.42..101294.00 rows=43855 width=8) (actual time=0.025..41.378 rows=34653 loops=3)
Filter: (security_findings.id >= 989)
Rows Removed by Filter: 114
Buffers: shared hit=8941 dirtied=1
Deletes the security finding records without UUID;
DELETE FROM "security_findings"
WHERE "security_findings"."uuid" IS NULL
AND "security_findings"."id" >= 989
AND "security_findings"."id" < 99900
ModifyTable on public.security_findings (cost=1157.19..1172.41 rows=10 width=6) (actual time=49.964..49.967 rows=0 loops=1)
Buffers: shared hit=2922 read=30 dirtied=172
I/O Timings: read=36.014
-> Bitmap Heap Scan on public.security_findings (cost=1157.19..1172.41 rows=10 width=6) (actual time=45.371..46.403 rows=2274 loops=1)
Buffers: shared hit=471 read=30
I/O Timings: read=36.014
-> BitmapAnd (cost=1157.19..1157.19 rows=10 width=0) (actual time=45.315..45.316 rows=0 loops=1)
Buffers: shared hit=294 read=30
I/O Timings: read=36.014
-> Bitmap Index Scan using security_findings_pkey (cost=0.00..195.26 rows=7019 width=0) (actual time=36.562..36.562 rows=2274 loops=1)
Index Cond: ((security_findings.id >= 989) AND (security_findings.id < 99900))
Buffers: shared hit=6 read=30
I/O Timings: read=36.014
-> Bitmap Index Scan using tmp_index_on_security_findings_scan_id (cost=0.00..961.68 rows=105252 width=0) (actual time=8.691..8.691 rows=104302 loops=1)
Buffers: shared hit=288
Does this MR meet the acceptance criteria?
Conformity
-
📋 Does this MR need a changelog?-
I have included a changelog entry. -
I have not included a changelog entry because _____.
-
- [-] Documentation (if required)
-
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
- [-] Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Mehmet Emin INAC