Implement background migration to create remediations
What does this MR do?
This MR introduces a new background migration to create Vulnerabilities::Remediation
entries in the database.
Related to #239177 and #239183
Database review
Scheduler post database migration
The scheduler migration will schedule background jobs by using the ProjectSetting.has_vulnerabilities
scope. Currently, this scope returns 12751 projects so having the batch count set as 10 means that we will have 1276 jobs enqueued after the migration runs. We already have a proper index created for the ProjectSetting.has_vulnerabilities
scope and the execution of this migration shouldn't take more than 10 seconds(Dry run on prod console takes around 2 seconds without the Redis access).
The query to fetch the initial project ID used by the each_batch
;
SELECT
"project_settings"."project_id"
FROM
"project_settings"
WHERE (has_vulnerabilities IS TRUE)
ORDER BY
"project_settings"."project_id" ASC
LIMIT 1
And the execution plan;
Limit (cost=0.29..0.31 rows=1 width=4) (actual time=2.781..2.783 rows=1 loops=1)
Buffers: shared read=3
I/O Timings: read=2.652
-> Index Only Scan using index_project_settings_on_project_id_partially on public.project_settings (cost=0.29..253.58 rows=11186 width=4) (actual time=2.780..2.780 rows=1 loops=1)
Heap Fetches: 0
Buffers: shared read=3
I/O Timings: read=2.652
The query to fetch the upper bound of each batch;
SELECT
"project_settings"."project_id"
FROM
"project_settings"
WHERE (has_vulnerabilities IS TRUE)
AND "project_settings"."project_id" >= 1
ORDER BY
"project_settings"."project_id" ASC
LIMIT 1 OFFSET 10
And the execution plan;
Limit (cost=0.54..0.56 rows=1 width=4) (actual time=0.969..0.970 rows=1 loops=1)
Buffers: shared hit=4 read=4
I/O Timings: read=0.895
-> Index Only Scan using index_project_settings_on_project_id_partially on public.project_settings (cost=0.29..281.54 rows=11186 width=4) (actual time=0.196..0.967 rows=11 loops=1)
Index Cond: (project_settings.project_id >= 1)
Heap Fetches: 0
Buffers: shared hit=4 read=4
I/O Timings: read=0.895
Sidekiq background jobs
The background jobs are intentionally designed to run for projects and not for the findings. The reason is, it is possible to have different findings having the same remediation so if we iterate finding by finding, we can not optimize the remediation creation process but if we process a whole project at once, we can minimize the database queries needed to create the remediations.
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry - [-] Documentation (if required)
-
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides - [-] Separation of EE specific content
Availability and Testing
- [-] Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team