WIP: Backfill LfsObjectsProject records of forks
What does this MR do?
To fix the behavior of existing forks, their LfsObjectsProject records need to be backfilled. Once they are, looking up LFS objects for a fork no longer has to depend on the source project.
This is based on the implementation in !25343 (merged), which was reverted. The difference is that the query used to find the projects to migrate is now performant.
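To make the idea of the backfill concrete, here is a minimal sketch of linking one fork to the LFS objects of its source. It is not the query used in this MR. It only assumes the `fork_network_members` and `lfs_objects_projects` columns that appear in the test data below, and the fork ID is a placeholder taken from that test data.

```sql
/* Hypothetical sketch, not the migration's actual query:
   copy the source project's LFS object links to a single fork,
   skipping links the fork already has. */
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at)
SELECT source_links.lfs_object_id, forks.project_id, NOW(), NOW()
FROM fork_network_members AS forks
JOIN lfs_objects_projects AS source_links
  ON source_links.project_id = forks.forked_from_project_id
WHERE forks.project_id = 100000011 /* placeholder fork ID */
  AND NOT EXISTS (
    SELECT 1
    FROM lfs_objects_projects AS existing
    WHERE existing.project_id = forks.project_id
      AND existing.lfs_object_id = source_links.lfs_object_id
  );
```

The `NOT EXISTS` guard keeps the statement idempotent, so rerunning it for the same fork does not create duplicate links.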
Migration
Output:
== 20200306134708 RescheduleLinkLfsObjects: migrating =========================
== 20200306134708 RescheduleLinkLfsObjects: migrated (37.6010s) ===============
Tested sample queries on #database-lab. Here's the test data that I added on top of existing data:
/* Create source project */
INSERT INTO projects (id, namespace_id, name, archived, created_at, updated_at) VALUES (100000001, 9970, 'pb-lfs-backfill-source', false, NOW(), NOW());
/* Create forks */
INSERT INTO projects (id, namespace_id, name, archived, created_at, updated_at) SELECT n, 2327904, 'pb-lfs-backfill-fork', false, NOW(), NOW() FROM generate_series(100000002, 100100001) AS n;
/* Create fork networks and members */
INSERT INTO fork_networks (id, root_project_id) VALUES (100000001, 100000001);
INSERT INTO fork_network_members (fork_network_id, project_id) VALUES (100000001, 100000001);
INSERT INTO fork_network_members (fork_network_id, forked_from_project_id, project_id) SELECT 100000001, 100000001, projects.id FROM projects WHERE name = 'pb-lfs-backfill-fork';
/* Create `LfsObjectsProject` records for the source project and the first nine forks */
INSERT INTO lfs_objects_projects (id, lfs_object_id, project_id, created_at, updated_at) SELECT n, n, 100000001, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000002, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000003, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000004, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000005, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000006, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000007, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000008, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000009, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000010, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
ANALYZE projects;
ANALYZE fork_network_members;
ANALYZE lfs_objects_projects;
This creates 100k forks and 1M lfs_objects_projects records. Queries and plans are added to the corresponding lines.
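As a quick sanity check on the size of the test data, a sketch like the following counts the rows created above. It assumes the IDs and the fork name from the inserts, and that no other rows in the database use them.

```sql
/* Forks created by generate_series(100000002, 100100001): 100,000 rows */
SELECT COUNT(*)
FROM projects
WHERE name = 'pb-lfs-backfill-fork';

/* LFS object links per project: the source project plus the first nine forks,
   each populated from generate_series(100000000, 100100000), i.e. 100,001 rows */
SELECT project_id, COUNT(*) AS link_count
FROM lfs_objects_projects
WHERE project_id BETWEEN 100000001 AND 100000010
GROUP BY project_id
ORDER BY project_id;
```

Ten projects with 100,001 links each is where the roughly 1M lfs_objects_projects figure comes from.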
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- [-] Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- [-] Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention @gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team