Change JiraTrackerData#deployment_type based on URL
What does this MR do?
After fixing #324251 (closed), we had a customer report that their JIRA integration stopped working. A Sentry event confirmed that this came from the project's corresponding `JiraService` having its `JiraTrackerData#deployment_type` set to `'unknown'`. This background migration sets `deployment_type`, for rows where it is unknown, based on the `url` field.
We have around 90k records in this table where `deployment_type` is unknown. The logic is simple: if the URL ends in `atlassian.net`, we are most likely dealing with JIRA Cloud; otherwise it's JIRA Server.
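The URL check described above could be sketched roughly as follows. This is a hypothetical illustration, not the actual migration code; the method name and symbol return values are assumptions.

```ruby
require 'uri'

# Hypothetical sketch of the backfill logic: classify a Jira URL as
# cloud when its host ends in atlassian.net, server otherwise.
# Unparseable URLs fall back to server, the more conservative guess.
def deployment_type_from_url(url)
  host = URI.parse(url.to_s).host.to_s
  host.end_with?('atlassian.net') ? :cloud : :server
rescue URI::InvalidURIError
  :server
end
```

For example, `deployment_type_from_url('https://example.atlassian.net')` classifies as cloud, while a self-hosted URL like `https://jira.example.com` classifies as server.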
Related #329530 (closed)
Migrations
```shell
bundle exec rails db:migrate
== 20210421163509 ScheduleUpdateJiraTrackerDataDeploymentTypeBasedOnUrl: migrating
-- Scheduling UpdateJiraTrackerDataDeploymentTypeBasedOnUrl jobs
-- Scheduled 1 UpdateJiraTrackerDataDeploymentTypeBasedOnUrl jobs with a maximum of 2500 records per batch and an interval of 120 seconds.
   The migration is expected to take at least 120 seconds. Expect all jobs to have completed after 2021-05-20 10:14:49 UTC.
== 20210421163509 ScheduleUpdateJiraTrackerDataDeploymentTypeBasedOnUrl: migrated (0.0240s)
```
```shell
bundle exec rails db:migrate:down VERSION=20210421163509
== 20210421163509 ScheduleUpdateJiraTrackerDataDeploymentTypeBasedOnUrl: reverting
== 20210421163509 ScheduleUpdateJiraTrackerDataDeploymentTypeBasedOnUrl: reverted (0.0000s)
```
Timings
Selecting a batch
https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/4132/commands/14399
Query
```sql
SELECT * FROM jira_tracker_data WHERE deployment_type = 0 AND id BETWEEN 1 AND 2500
```
Plan
```
Index Scan using jira_tracker_data_pkey on public.jira_tracker_data  (cost=0.42..1444.04 rows=1139 width=340) (actual time=49.666..700.757 rows=21 loops=1)
  Index Cond: ((jira_tracker_data.id >= 1) AND (jira_tracker_data.id <= 2500))
  Filter: (jira_tracker_data.deployment_type = 0)
  Rows Removed by Filter: 2189
  Buffers: shared hit=568 read=838 dirtied=99
  I/O Timings: read=687.760 write=0.000
```
Timings
```
Time: 707.502 ms
  - planning: 6.678 ms
  - execution: 700.824 ms
    - I/O read: 687.760 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 568 (~4.40 MiB) from the buffer pool
  - reads: 838 (~6.50 MiB) from the OS file cache, including disk I/O
  - dirtied: 99 (~792.00 KiB)
  - writes: 0
```
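For reference, the MiB/KiB figures in these timings are just buffer counts multiplied by PostgreSQL's 8 KiB page size (postgres.ai's own rounding may differ slightly in the last digit). A small sketch of the arithmetic:

```ruby
# Each PostgreSQL shared buffer is one 8 KiB page, so the sizes shown
# in the timings are buffer counts multiplied by 8 KiB.
PAGE_KIB = 8

def buffers_to_mib(count)
  (count * PAGE_KIB / 1024.0).round(2)
end
```

For example, the 99 dirtied buffers above come out to 99 × 8 KiB = 792 KiB, matching the reported ~792.00 KiB.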
Updating a batch
Batch size of 2500: https://explain.depesz.com/s/tFlo (thanks @abrandl)
This is with a batch size of 250; I can't get #database-lab to swallow a query with the 2500 pairs that `BulkUpdate` generates.
https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/4132/commands/14414
Query
```sql
EXPLAIN WITH cte(cte_id, cte_deployment_type) AS MATERIALIZED (
  VALUES
    (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1),
    (11, 1), (12, 1), (13, 1), (14, 1), (15, 1), (16, 1), (17, 1), (18, 1), (19, 1), (20, 1),
    (21, 1), (22, 1), (23, 1), (24, 1), (25, 1), (26, 1), (27, 1), (28, 1), (29, 1), (30, 1),
    (31, 1), (32, 1), (33, 1), (34, 1), (35, 1), (36, 1), (37, 1), (38, 1), (39, 1), (40, 1),
    (41, 1), (42, 1), (43, 1), (44, 1), (45, 1), (46, 1), (47, 1), (48, 1), (49, 1), (50, 1),
    (51, 1), (52, 1), (53, 1), (54, 1), (55, 1), (56, 1), (57, 1), (58, 1), (59, 1), (60, 1),
    (61, 1), (62, 1), (63, 1), (64, 1), (65, 1), (66, 1), (67, 1), (68, 1), (69, 1), (70, 1),
    (71, 1), (72, 1), (73, 1), (74, 1), (75, 1), (76, 1), (77, 1), (78, 1), (79, 1), (80, 1),
    (81, 1), (82, 1), (83, 1), (84, 1), (85, 1), (86, 1), (87, 1), (88, 1), (89, 1), (90, 1),
    (91, 1), (92, 1), (93, 1), (94, 1), (95, 1), (96, 1), (97, 1), (98, 1), (99, 1), (100, 1),
    (101, 1), (102, 1), (103, 1), (104, 1), (105, 1), (106, 1), (107, 1), (108, 1), (109, 1), (110, 1),
    (111, 1), (112, 1), (113, 1), (114, 1), (115, 1), (116, 1), (117, 1), (118, 1), (119, 1), (120, 1),
    (121, 1), (122, 1), (123, 1), (124, 1), (125, 1), (126, 1), (127, 1), (128, 1), (129, 1), (130, 1),
    (131, 1), (132, 1), (133, 1), (134, 1), (135, 1), (136, 1), (137, 1), (138, 1), (139, 1), (140, 1),
    (141, 1), (142, 1), (143, 1), (144, 1), (145, 1), (146, 1), (147, 1), (148, 1), (149, 1), (150, 1),
    (151, 1), (152, 1), (153, 1), (154, 1), (155, 1), (156, 1), (157, 1), (158, 1), (159, 1), (160, 1),
    (161, 1), (162, 1), (163, 1), (164, 1), (165, 1), (166, 1), (167, 1), (168, 1), (169, 1), (170, 1),
    (171, 1), (172, 1), (173, 1), (174, 1), (175, 1), (176, 1), (177, 1), (178, 1), (179, 1), (180, 1),
    (181, 1), (182, 1), (183, 1), (184, 1), (185, 1), (186, 1), (187, 1), (188, 1), (189, 1), (190, 1),
    (191, 1), (192, 1), (193, 1), (194, 1), (195, 1), (196, 1), (197, 1), (198, 1), (199, 1), (200, 1),
    (201, 1), (202, 1), (203, 1), (204, 1), (205, 1), (206, 1), (207, 1), (208, 1), (209, 1), (210, 1),
    (211, 1), (212, 1), (213, 1), (214, 1), (215, 1), (216, 1), (217, 1), (218, 1), (219, 1), (220, 1),
    (221, 1), (222, 1), (223, 1), (224, 1), (225, 1), (226, 1), (227, 1), (228, 1), (229, 1), (230, 1),
    (231, 1), (232, 1), (233, 1), (234, 1), (235, 1), (236, 1), (237, 1), (238, 1), (239, 1), (240, 1),
    (241, 1), (242, 1), (243, 1), (244, 1), (245, 1), (246, 1), (247, 1), (248, 1), (249, 1), (250, 1)
) UPDATE jira_tracker_data SET deployment_type = cte.cte_deployment_type FROM cte WHERE cte_id = id;
```
Plan
```
ModifyTable on public.jira_tracker_data  (cost=3.54..829.13 rows=250 width=410) (actual time=41.433..41.435 rows=0 loops=1)
  Buffers: shared hit=2983 read=56 dirtied=224
  I/O Timings: read=27.465 write=0.000
  CTE cte
    ->  Values Scan on "*VALUES*"  (cost=0.00..3.12 rows=250 width=8) (actual time=0.002..0.250 rows=250 loops=1)
          I/O Timings: read=0.000 write=0.000
  ->  Nested Loop  (cost=0.42..826.00 rows=250 width=410) (actual time=0.067..7.001 rows=227 loops=1)
        Buffers: shared hit=981
        I/O Timings: read=0.000 write=0.000
        ->  CTE Scan on cte  (cost=0.00..5.00 rows=250 width=40) (actual time=0.010..0.485 rows=250 loops=1)
              I/O Timings: read=0.000 write=0.000
        ->  Index Scan using jira_tracker_data_pkey on public.jira_tracker_data  (cost=0.42..3.28 rows=1 width=344) (actual time=0.025..0.025 rows=1 loops=250)
              Index Cond: (jira_tracker_data.id = cte.cte_id)
              Buffers: shared hit=981
              I/O Timings: read=0.000 write=0.000
```
Summary
```
Time: 42.116 ms
  - planning: 0.594 ms
  - execution: 41.522 ms
    - I/O read: 27.465 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 2983 (~23.30 MiB) from the buffer pool
  - reads: 56 (~448.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 224 (~1.80 MiB)
  - writes: 0
```
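The bulk update above pairs each row id with its new `deployment_type` value inside a materialized `VALUES` CTE, so the whole batch is applied in one statement. A hedged sketch of how such SQL could be generated; the actual `BulkUpdate` helper may produce different SQL, and `bulk_update_sql` is a hypothetical name:

```ruby
# Hypothetical sketch: build a single-statement bulk UPDATE from
# [id, deployment_type] pairs, mirroring the CTE shape shown above.
# Integer() coerces both values, so non-numeric input raises instead
# of being interpolated into the SQL.
def bulk_update_sql(pairs)
  values = pairs.map { |id, type| "(#{Integer(id)}, #{Integer(type)})" }.join(', ')
  <<~SQL
    WITH cte(cte_id, cte_deployment_type) AS MATERIALIZED (
      VALUES #{values}
    )
    UPDATE jira_tracker_data
    SET deployment_type = cte.cte_deployment_type
    FROM cte
    WHERE cte_id = id
  SQL
end
```

One statement per batch keeps the per-row cost down to an index lookup on `jira_tracker_data_pkey`, which is what the nested-loop plan above shows.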
Does this MR meet the acceptance criteria?
Conformity
- 📋 Does this MR need a changelog?
  - I have included a changelog entry.
  - I have not included a changelog entry because _____.
- [-] Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention @gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team