Rake Task to truncate Legacy tables on Main and CI Database
What does this MR do and why?
This MR introduced Rake task, to truncate the rest of the CI tables on the Main database, and the rest of the Main tables on the CI database.
How do we truncate tables in this rake task?
Let's say that we want to truncate tables (table1, table2, table3, ... table9). We start by truncating the tables [table1, table2, table3], which don't depend on the other tables. Then we truncate the next 3 tables [table4, table5, table6]. But we still have to include the previously truncated 3 tables in the Truncate SQL statement, even if the tables have been truncated already.
We do the truncation in stages, to avoid any IO problems. Because the tables are huge. It might take 1 - 2 minutes to truncate some tables on Production. See here: gitlab-com/gl-infra/production#7511 (comment 1054799933)
How to set up and validate locally
-
Make sure you are running in Decomposed Database Setup, which is the default option in GDK now. #356353 (closed)
-
Lock the CI tables on the Main database, and the Main tables on the CI Database, using the rake task
rake gitlab:db:lock_writes
Make sure that you cannot truncate a locked table using PSQL
gdk psql -c "truncate ci_pipelines"
ERROR: cannot truncate a table referenced in a foreign key constraint
DETAIL: Table "ci_builds" references "ci_pipelines".
HINT: Truncate table "ci_builds" at the same time, or use TRUNCATE ... CASCADE.
- Now run the rake tasks to truncate the tables in
DRY_RUN
mode, see the commands that are going to run on both databases to truncate the tables.
DRY_RUN=true rake gitlab:db:truncate_legacy_tables:main
This will show truncation stages. In each stage it will truncate a new set of tables + all the tables that have been truncated in the previous stages. You can see the truncated tables in each stage with the message New tables to truncate:
Let's say that we want to truncate only the first 3 stages (around 15 tables). We can run the command
DRY_RUN=true UNTIL_TABLE=ci_unit_test_failures rake gitlab:db:truncate_legacy_tables:main
Remove the DRY_RUN=true
to do the actual truncation.
If you don't specify UNTIL_TABLE
, it will truncate all the tables. But 5 tables in each stage. To avoid truncating all the tables within 1 transaction.
If you wnat to change the number of the tables in each stage, you can pass the parameter to the rake task, for example:
DRY_RUN=true rake gitlab:db:truncate_legacy_tables:main[\30\]
This will truncate 30 new tables at least in each stage.
- The same for the Main Tables on the CI database
DRY_RUN=true rake gitlab:db:truncate_legacy_tables:ci
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related to #366787 (closed)