Add rake task for migration legacy pages storage
What does this MR do?
Related to #282466 (closed)
This rake task just uses service introduced in !49473 (merged) (this MR currently targets this branch).
We plan to run this task on .com soon.
It probably will go slow and we will want to optimize it, e.g. migrate some projects in parallel. But we want to try running if first and see how it goes.
This task can be stopped/started at any time: it will just continue to handle projects which aren't migrated yet.
Database queries
Thanks to @krasio:
The execution plans doesn't look too bad - we do not have the best possible index but given we are working in small batches (1_000) timings are OK. We may need to add a dedicated index (WHERE deployed = true AND pages_deployment_id IS NULL
) when we want to switch to background migration (which we'll have to do eventually for self-managed).
1. Batch 1 - start id
SELECT "project_pages_metadata"."project_id" FROM "project_pages_metadata" WHERE "project_pages_metadata"."deployed" = TRUE AND "project_pages_metadata"."pages_deployment_id" IS NULL ORDER BY "project_pages_metadata"."project_id" ASC LIMIT 1
https://postgres.ai/console/shared/e673c237-7a0a-486f-9912-1939e7c2b26e
gitlabhq_production=> SELECT "project_pages_metadata"."project_id" FROM "project_pages_metadata" WHERE "project_pages_metadata"."deployed" = TRUE AND "project_pages_metadata"."pages_deployment_id" IS NULL ORDER BY "project_pages_metadata"."project_id" ASC LIMIT 1;
project_id
------------
179
(1 row)
Time: 4.564 ms
2. Batch 1 - end id
SELECT "project_pages_metadata"."project_id" FROM "project_pages_metadata" WHERE "project_pages_metadata"."deployed" = TRUE AND "project_pages_metadata"."pages_deployment_id" IS NULL AND "project_pages_metadata"."project_id" >= 179 ORDER BY "project_pages_metadata"."project_id" ASC LIMIT 1 OFFSET 1000
https://postgres.ai/console/shared/abd55983-015c-43cd-8780-58ddc3436bd7
gitlabhq_production=> SELECT "project_pages_metadata"."project_id" FROM "project_pages_metadata" WHERE "project_pages_metadata"."deployed" = TRUE AND "project_pages_metadata"."pages_deployment_id" IS NULL AND "project_pages_metadata"."project_id" >= 179 ORDER BY "project_pages_metadata"."project_id" ASC LIMIT 1 OFFSET 1000;
project_id
------------
1015390
(1 row)
Time: 262.336 ms
3. Batch 1 - SELECT * for batch
SELECT "project_pages_metadata".* FROM "project_pages_metadata" WHERE "project_pages_metadata"."deployed" = TRUE AND "project_pages_metadata"."pages_deployment_id" IS NULL AND "project_pages_metadata"."project_id" >= 179 AND "project_pages_metadata"."project_id" <= 1015390
https://postgres.ai/console/shared/1e893edc-c913-4402-80ae-fd30de93f17e
4. Batch 2 - end id
SELECT "project_pages_metadata"."project_id" FROM "project_pages_metadata" WHERE "project_pages_metadata"."deployed" = TRUE AND "project_pages_metadata"."pages_deployment_id" IS NULL AND "project_pages_metadata"."project_id" >= 1015390 ORDER BY "project_pages_metadata"."project_id" ASC LIMIT 1 OFFSET 1000
https://postgres.ai/console/shared/9bbff61d-0c51-44c5-b8e8-b6eeba76c238
gitlabhq_production=> SELECT "project_pages_metadata"."project_id" FROM "project_pages_metadata" WHERE "project_pages_metadata"."deployed" = TRUE AND "project_pages_metadata"."pages_deployment_id" IS NULL AND "project_pages_metadata"."project_id" >= 1015390 ORDER BY "project_pages_metadata"."project_id" ASC LIMIT 1 OFFSET 1000;
project_id
------------
1153397
(1 row)
Time: 43.223 ms
5. Batch 2 - SELECT * for batch
SELECT "project_pages_metadata".* FROM "project_pages_metadata" WHERE "project_pages_metadata"."deployed" = TRUE AND "project_pages_metadata"."pages_deployment_id" IS NULL AND "project_pages_metadata"."project_id" >= 1015390 AND "project_pages_metadata"."project_id" <= 1153397
https://postgres.ai/console/shared/ce631e47-6721-4b15-8554-eff84411d972
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team