Proposal: Always expect no-downtime migrations
Context
A downtime check process introduced 4 years ago allows migrations to take GitLab installations offline (see gitlab-foss!4911 (merged) and gitlab-foss#14545 (closed)). From the issue, the goals of allowing downtime in migrations were:
- To help Release Managers identify which migrations require downtime, and,
- To notify the developers when a change might require downtime.
From a quick git search, the last migration that required downtime was introduced 3 years ago.
Problem
In our current development/infrastructure setup, we don't have any strategy in place if a migration requires downtime:
- If a merge request with DOWNTIME = true is opened, CI pipelines don't fail.
- There's also no notification by Danger or any bot about this change. I tested this on !58128 (comment 542355682).
- Release Managers no longer verify whether a migration requires downtime; they assume all migrations are zero-downtime migrations.
- When deploying to GitLab.com, the deployer pipeline doesn't take this into account.
- Furthermore, we have several strategies in place that help us to have zero-downtime migrations.
With our continuous delivery model, the MTTP, and the GitLab.com availability goals, there shouldn't be any case for a migration to require downtime, either for GitLab.com or for self-hosted instances.
Proposal
Remove the checks around DOWNTIME=true and the constant from the migration template, and update the documentation to state that all migrations should be zero-downtime migrations:
- Remove DowntimeCheck rake task
- Remove the rake task check from CI
- Remove the DOWNTIME constant from the template used to generate migrations
- Update development docs to make it clear we don't allow downtime and remove mention of approval process for downtime
- Announce this in #backend, #development, #quality and add an item about this in the Engineering Week in Review.
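
As a rough illustration of the cleanup steps above, the following Ruby sketch scans migration source text for a DOWNTIME constant, which could help list the files to touch before the constant is dropped from the template. It is a plain text scan written for this example, not GitLab's DowntimeCheck implementation; the helper name downtime_declarations, the file names, and the sample migrations are all hypothetical.

```ruby
# Illustrative text scan only; this is NOT GitLab's DowntimeCheck code.
# Matches lines such as "DOWNTIME = true" but not "DOWNTIME_REASON = ...".
DOWNTIME_PATTERN = /^\s*DOWNTIME\s*=\s*(true|false)\b/

# Hypothetical helper: takes a hash of { path => source } and returns one
# entry per migration that declares the DOWNTIME constant.
def downtime_declarations(sources)
  sources.each_with_object([]) do |(path, source), found|
    match = source.match(DOWNTIME_PATTERN)
    next unless match

    found << { path: path, downtime: match[1] == 'true' }
  end
end

# Made-up sample input: file name => file contents.
SAMPLE_MIGRATIONS = {
  'db/migrate/20210101000000_add_example_column.rb' => <<~RUBY,
    class AddExampleColumn < ActiveRecord::Migration[6.0]
      DOWNTIME = false

      def change
        add_column :examples, :note, :text
      end
    end
  RUBY
  'db/migrate/20210102000000_rename_example_column.rb' => <<~RUBY,
    class RenameExampleColumn < ActiveRecord::Migration[6.0]
      DOWNTIME = true

      def change
        rename_column :examples, :note, :comment
      end
    end
  RUBY
  'db/migrate/20210103000000_add_example_index.rb' => <<~RUBY
    class AddExampleIndex < ActiveRecord::Migration[6.0]
      def change
        add_index :examples, :comment
      end
    end
  RUBY
}.freeze

downtime_declarations(SAMPLE_MIGRATIONS).each do |entry|
  puts "#{entry[:path]}: DOWNTIME = #{entry[:downtime]}"
end
```

Entries with downtime: true would be the migrations that still assume an offline window. Once no migration declares the constant, removing it from the template should not leave any stragglers behind.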