Backfill LfsObjectsProject records for forks
What does this MR do?
A background migration has been added to link all LFS objects of a project. It is scheduled by a post-deploy migration that queries all forks with LFS enabled.
For each fork, the background migration fetches all LFS pointers via Gitaly to obtain the OIDs of the LFS objects that need to be linked.
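To illustrate how an OID can be read from an LFS pointer, here is a minimal sketch that parses the plain-text pointer format defined by the Git LFS specification. The helper name lfs_oid_from_pointer and the constant LFS_OID_PATTERN are hypothetical and are not taken from this MR; the actual migration gets its pointers from Gitaly rather than from raw strings.

```ruby
# Hypothetical helper, not the code in this MR: extract the LFS object OID
# from the text of a Git LFS pointer file (format per the Git LFS spec).
LFS_OID_PATTERN = /^oid sha256:(\h{64})$/

def lfs_oid_from_pointer(pointer_text)
  match = LFS_OID_PATTERN.match(pointer_text)
  match && match[1] # nil when the text is not a valid pointer
end

pointer = <<~POINTER
  version https://git-lfs.github.com/spec/v1
  oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
  size 12345
POINTER

puts lfs_oid_from_pointer(pointer)
```

The extracted OID is what identifies the LFS object that has to be linked to the fork.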
Post migration query
Sample (has LIMIT because we're using each_batch):
SELECT projects.id
FROM projects
INNER JOIN fork_network_members ON fork_network_members.project_id = projects.id
WHERE (projects.lfs_enabled = TRUE OR projects.lfs_enabled IS NULL)
AND (fork_network_members.forked_from_project_id IS NOT NULL)
LIMIT 1000
Query plan: https://explain.depesz.com/s/KaAC
Background migration schedule
Background migration jobs will be enqueued with 1000 projects each. The delay for each job is calculated using the formula:
index (zero based) * batch size * interval
Where:
- index - zero based. This is the index of each batch.
- batch size - size of the batch divided by concurrency rate (BATCH_SIZE is 1000 and CONCURRENCY is 4, so this can be <= 250)
- interval - 30 seconds. Based on the 99th percentile latency of GetAllLfsPointers calls.
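The formula can be worked through with a small sketch. The constant values come from this description; the method names delay_for and wave_index are hypothetical, and the sketch assumes that index advances once per group of CONCURRENCY jobs, which is one reading of how the jobs are grouped and may not match the migration exactly.

```ruby
BATCH_SIZE = 1_000       # from the description
CONCURRENCY = 4          # from the description
INTERVAL_SECONDS = 30    # 99th percentile latency of GetAllLfsPointers calls

# Delay in seconds for a zero-based index:
# index * (batch size / concurrency) * interval
def delay_for(index)
  index * (BATCH_SIZE / CONCURRENCY) * INTERVAL_SECONDS
end

# Assumption (not confirmed by the MR): the index used in the formula
# advances once for every CONCURRENCY jobs.
def wave_index(job_number)
  job_number / CONCURRENCY
end

(0...8).each do |job|
  puts "job #{job}: starts after #{delay_for(wave_index(job))} seconds"
end
# jobs 0-3 print 0 seconds, jobs 4-7 print 7500 seconds (2 hours 5 minutes)
```

With these values one step of the index adds 250 * 30 = 7500 seconds, which is roughly 2 hours.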
The first 4 jobs will be enqueued and worked on immediately, while the next 4 jobs will be enqueued but only worked on after about 2 hours. Each job will process its projects sequentially. This approach was chosen based on !24164 (comment 282672404).
I wanted to bulk enqueue smaller individual jobs (with #push_bulk) on different schedules instead of big jobs, but our version of Sidekiq doesn't support it (support was added in 6.0.1 and we're using 5.2.7). We could enqueue multiple jobs using #perform_in, but that would be n+1 requests to Redis, so I opted for this approach.
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- [-] Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- [-] Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention @gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team