Enable removing import data on failure by default

Igor Drozdov requested to merge id-enable-removing-import-data-on-failure into master

What does this MR do and why?

Related issue: #352156 (closed)

When an import fails, there is no need to keep its import data. This MR enables removing that data on failure by default; the functionality has been enabled globally for a while now.

Project import data for failed imports is already being removed. Below are the query used to check the last week of data and its results:

EXPLAIN SELECT "project_mirror_data".* FROM "project_mirror_data" INNER JOIN
"projects" "project" ON "project"."id" = "project_mirror_data"."project_id"
INNER JOIN "project_import_data" ON "project_import_data"."project_id" =
"project"."id" WHERE "project_mirror_data"."status" = 'failed' AND
"project_mirror_data"."last_update_scheduled_at" > '2022-02-15 17:52:50.670114'
AND "project"."mirror" = false 

https://files.slack.com/files-pri/T02592416-F0349TSCXV1/plan-text.txt

 Gather  (cost=1001.54..88293.81 rows=154 width=269) (actual time=20069.690..20069.938 rows=0 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=25045 read=62580 dirtied=4163
   I/O Timings: read=59219.539 write=0.000
   ->  Nested Loop  (cost=1.54..87278.41 rows=64 width=269) (actual time=20023.972..20023.975 rows=0 loops=3)
         Buffers: shared hit=25045 read=62580 dirtied=4163
         I/O Timings: read=59219.539 write=0.000
         ->  Nested Loop  (cost=0.98..86712.37 rows=284 width=273) (actual time=11369.826..18766.191 rows=1062 loops=3)
               Buffers: shared hit=12867 read=58819 dirtied=4021
               I/O Timings: read=55517.442 write=0.000
               ->  Parallel Index Scan using index_project_mirror_data_on_status on public.project_mirror_data  (cost=0.56..83243.36 rows=2384 width=269) (actual time=4786.758..18529.918 rows=1235 loops=3)
                     Index Cond: ((project_mirror_data.status)::text = 'failed'::text)
                     Filter: (project_mirror_data.last_update_scheduled_at > '2022-02-15 17:52:50.670114'::timestamp without time zone)
                     Rows Removed by Filter: 68355
                     Buffers: shared hit=452 read=58054 dirtied=3924
                     I/O Timings: read=54873.931 write=0.000
               ->  Index Only Scan using index_project_import_data_on_project_id on public.project_import_data  (cost=0.42..1.45 rows=1 width=4) (actual time=0.187..0.188 rows=1 loops=3704)
                     Index Cond: (project_import_data.project_id = project_mirror_data.project_id)
                     Heap Fetches: 2050
                     Buffers: shared hit=12415 read=765 dirtied=97
                     I/O Timings: read=643.511 write=0.000
         ->  Index Scan using projects_pkey on public.projects project  (cost=0.56..1.98 rows=1 width=4) (actual time=1.183..1.183 rows=0 loops=3186)
               Index Cond: (project.id = project_import_data.project_id)
               Filter: (NOT project.mirror)
               Rows Removed by Filter: 1
               Buffers: shared hit=12178 read=3761 dirtied=142
               I/O Timings: read=3702.098 write=0.000
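
To make the check above concrete, here is a minimal sketch that runs the same join logic against an in-memory SQLite database. The table and column names come from the query; the simplified schema, the toy rows, and the helper names `build_toy_db` and `failed_imports_with_data` are assumptions made for illustration, not GitLab code. An empty result means no recently failed, non-mirror import still has import data, which matches `rows=0` in the plan.

```python
import sqlite3


def build_toy_db():
    """Create a simplified, hypothetical version of the three tables with toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE projects (id INTEGER PRIMARY KEY, mirror INTEGER NOT NULL);
        CREATE TABLE project_mirror_data (
            project_id INTEGER, status TEXT, last_update_scheduled_at TEXT);
        CREATE TABLE project_import_data (project_id INTEGER);

        -- 1: failed import, import data already removed
        -- 2: failed mirror, import data present (excluded: mirror)
        -- 3: failed import, import data still present (leftover)
        -- 4: failed import before the cutoff (excluded: too old)
        INSERT INTO projects VALUES (1, 0), (2, 1), (3, 0), (4, 0);
        INSERT INTO project_mirror_data VALUES
            (1, 'failed', '2022-02-16 10:00:00'),
            (2, 'failed', '2022-02-16 11:00:00'),
            (3, 'failed', '2022-02-17 09:00:00'),
            (4, 'failed', '2022-02-01 09:00:00');
        INSERT INTO project_import_data VALUES (2), (3), (4);
        """
    )
    return conn


def failed_imports_with_data(conn, since):
    """Return ids of non-mirror projects whose import failed after `since`
    and that still have a project_import_data row."""
    rows = conn.execute(
        """
        SELECT project_mirror_data.project_id
        FROM project_mirror_data
        INNER JOIN projects AS project
            ON project.id = project_mirror_data.project_id
        INNER JOIN project_import_data
            ON project_import_data.project_id = project.id
        WHERE project_mirror_data.status = 'failed'
          AND project_mirror_data.last_update_scheduled_at > ?
          AND project.mirror = 0
        ORDER BY project_mirror_data.project_id
        """,
        (since,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_toy_db()
    print(failed_imports_with_data(conn, "2022-02-15 17:52:50.670114"))
```

In the toy data only project 3 is reported, because its import data has not been removed yet. Once that row is deleted, the query returns nothing, which is the state the production plan reflects.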
Edited by Igor Drozdov
