# Move a group to a new shard (POC)
## Problem to solve
We know that we need to move top-level namespaces to new shards, either for rebalancing or to move them off the current database. We should demonstrate how a single group could be moved to a new database.
## Solution
Use Postgres logical replication to replicate a group (top-level namespace) to a new shard.

The preferred algorithm is described in #329308 (closed). This issue should focus on the Postgres parts of that algorithm. The biggest risk (as I see it) is going to be features that are missing in Postgres, so we should de-risk by trying to implement the Postgres parts quickly.
## Technical steps
This algorithm is almost entirely described (but for MySQL) in https://www.usenix.org/conference/srecon19emea/presentation/li . Postgres specifics were discussed in this call https://youtu.be/0GtMDSKMCd4 and most details are described in https://paquier.xyz/postgresql-2/postgres-9-5-feature-highlight-pg-dump-snapshots/ .
The logic for moving `group-1` from `shard-0` to `shard-1` will be:
- GitLab starts a `GroupShardMoveWorker`
- Configure Postgres to replicate all data belonging to `group-1` from `shard-0` to `shard-1`
  - Such data will be `WHERE namespace_id IN (...all subgroups of group-1 and group-1 ids) OR project_id IN (...all projects in group-1 and its subgroups ids)` (see universal sharding IDs)
- Postgres will do an initial copy of all data plus a stream of all updates
  - Following https://paquier.xyz/postgresql-2/postgres-9-5-feature-highlight-pg-dump-snapshots/ our initial copy will need to first create a logical replication slot and get a snapshot ID
  - The snapshot ID will then need to be used to generate the initial data copy, taking care not to lose the initial connection before starting the new transaction with this snapshot ID
  - The copy can use any `SELECT` or `pg_dump` or any other normal Postgres queries, as it is just a normal Postgres transaction
- GitLab waits until the stream is almost caught up (some threshold of 1s lag should be fine)
- `GroupShardMoveWorker` acquires an exclusive lock for `group-1` (see Shared/Exclusive locking)
- `GroupShardMoveWorker` waits until the stream of updates is empty (writing is paused so this should take around 1s)
- (optional but advisable) `GroupShardMoveWorker` does a validation check (a checksum of all rows/columns reduced to a single number, to compare `shard-0` with `shard-1`)
  - As described in the linked presentation, the most likely cause of validation failure will be schema changes that happened during the move. Given the frequency of GitLab deployments and migrations we may want an automated way to coordinate so that we never try moving a group while migrations are in progress. Or we abort a failed move and just retry it later.
- Sharding tables are updated to reflect that `group-1` now belongs to `shard-1`
- `GroupShardMoveWorker` releases the exclusive lock for `group-1`
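The replication setup in the second step could be sketched as a publication/subscription pair. This is only a sketch: the table names, id lists, and connection string are illustrative placeholders, and row filters on publications require Postgres 15 or newer (on older versions the filtering would have to happen in the slot-based initial copy and apply logic instead).

```sql
-- On shard-0: publish only the rows that belong to group-1.
-- The WHERE row filters require Postgres 15+; the id lists are
-- illustrative stand-ins for "group-1 and all its subgroups".
CREATE PUBLICATION group_1_move_pub
  FOR TABLE namespaces WHERE (id IN (101, 102, 103)),
      projects WHERE (namespace_id IN (101, 102, 103));

-- On shard-1: subscribe; the initial table sync copies matching rows,
-- then streaming replication keeps them up to date.
CREATE SUBSCRIPTION group_1_move_sub
  CONNECTION 'host=shard-0 dbname=gitlabhq_production'
  PUBLICATION group_1_move_pub;
```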
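The snapshot-based initial copy could be sketched as below; the slot name and snapshot ID are illustrative. The key constraint (per the Paquier article above) is that the session that exported the snapshot must stay open until the copy transaction has started.

```sql
-- Session 1: a *replication* connection to shard-0, e.g.
--   psql "dbname=gitlabhq_production replication=database"
-- Creating the slot returns a consistent point plus an exported
-- snapshot name. Keep this session open: the snapshot dies with it.
CREATE_REPLICATION_SLOT group_1_move LOGICAL pgoutput EXPORT_SNAPSHOT;

-- Session 2: a normal connection, used for the initial copy.
-- Importing the snapshot makes every query in this transaction see the
-- database exactly as it was at the slot's consistent point.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '00000003-0000001B-1';  -- name returned above
-- ...any SELECT/COPY of group-1 rows; pg_dump --snapshot=... works too
COMMIT;
```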
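"Almost caught up" can be measured from the replication slot on `shard-0`; the slot name below is the same hypothetical one used in the sketches above.

```sql
-- Bytes of WAL the subscriber has not yet confirmed. A value near zero
-- (roughly ~1s worth of writes) means it is safe to take the lock.
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_name = 'group_1_move';
```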
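One possible shape for the optional validation step is a per-table row count plus an order-independent checksum, run on both shards and compared; the table and id filter are illustrative.

```sql
-- Run the same query on shard-0 and shard-1 and compare the two rows.
-- hashtext() over each row's text form gives a cheap per-row hash;
-- sum() makes the aggregate independent of row order.
SELECT count(*)               AS row_count,
       sum(hashtext(p::text)) AS checksum
FROM projects AS p
WHERE p.namespace_id IN (101, 102, 103);  -- illustrative group-1 ids
```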
## Resources
- https://www.usenix.org/conference/srecon19emea/presentation/li
- https://paquier.xyz/postgresql-2/postgres-9-5-feature-highlight-pg-dump-snapshots/
- Video with similar but earlier content: https://www.youtube.com/watch?v=gPqMUfYFBLs
- Understanding Logical Decoding and Replication - Michael Paquier
- Discuss Postgres Replication For Moving Between Shards - @DylanGriffith/@NikolayS
Confidence of due date May 12 => 80%