Proposal: Remove all single-node zero downtime instructions
What does this MR do?
Because GitLab 14.0 and later ship only Puma, single-node upgrades have no way to minimize downtime.
When migrations run while the previous version's Puma is still running, Rails picks up the database changes dynamically and users start seeing UI errors.
For users, losing critical features such as merge requests is essentially the same as downtime, so the section's claim that downtime is minimized is false.
Additionally, users often discover this section and follow it blindly, even for upgrades that jump multiple versions, which leads to broken upgrades.
This change removes all sections related to zero downtime upgrade instructions of single nodes.
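The failure mode described above can be illustrated with a small, self-contained sketch. This is not GitLab or Rails code: it is a Python analogy that assumes a hypothetical merge_requests table with an assignee column, and a hypothetical migration that renames that column. Records expose whatever columns the live schema currently has, much as Active Record derives attribute methods from a table's columns, so code written for the old schema fails as soon as the migration applies.

```python
import sqlite3
from types import SimpleNamespace


def load_row(conn, table, row_id):
    """Build a record whose attributes mirror the table's current columns."""
    cur = conn.execute(f"SELECT * FROM {table} WHERE id = ?", (row_id,))
    columns = [d[0] for d in cur.description]
    return SimpleNamespace(**dict(zip(columns, cur.fetchone())))


def old_version_render(conn):
    """Code from the previous release: it expects an `assignee` column."""
    record = load_row(conn, "merge_requests", 1)
    return f"Assigned to {record.assignee}"


def migrate(conn):
    """Hypothetical migration from the new release: renames the column."""
    conn.executescript(
        """
        CREATE TABLE merge_requests_new (id INTEGER PRIMARY KEY, assignee_name TEXT);
        INSERT INTO merge_requests_new SELECT id, assignee FROM merge_requests;
        DROP TABLE merge_requests;
        ALTER TABLE merge_requests_new RENAME TO merge_requests;
        """
    )


def simulate():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE merge_requests (id INTEGER PRIMARY KEY, assignee TEXT)")
    conn.execute("INSERT INTO merge_requests VALUES (1, 'alice')")

    before = old_version_render(conn)  # old code, old schema: works
    migrate(conn)                      # schema changes under the still-running old code
    try:
        after = old_version_render(conn)
    except AttributeError as exc:      # Python's counterpart to Ruby's NoMethodError
        after = f"error: {exc}"
    return before, after
```

Calling simulate() returns the text rendered before the migration and an error string after it. Restarting the process on the new code, as a regular upgrade with downtime does, is what removes the mismatch between the running code and the schema.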
Related issues
This is a non-exhaustive list of recent customer support tickets, all stemming from customers discovering and following the single-node zero-downtime instructions, which are unnecessarily complex for their setup and error-prone, with no visible benefit:
- https://gitlab.zendesk.com/agent/tickets/272336
  - The customer had copied the single-node zero-downtime commands into their runbook and ran into unrecoverable errors when following the upgrade path
- https://gitlab.zendesk.com/agent/tickets/272297
  - The customer followed the single-node zero-downtime upgrade instructions but jumped two versions, and ran into the expected NoMethodError that occurs immediately as migrations apply
- https://gitlab.zendesk.com/agent/tickets/269083
  - The customer performed the single-node zero-downtime steps and ran into NoMethodError because they missed the additional restart required at the end, after the migrations have run
- https://gitlab.zendesk.com/agent/tickets/266806
  - This ticket also shows software still running while the migration makes changes that the backend cannot cope with without a restart, something that occurs only when these single-node instructions are followed
- https://gitlab.zendesk.com/agent/tickets/264899
  - Similar to one of the tickets above: a new column introduced in the database is picked up dynamically while migrations run and breaks functionality on the single node during these upgrade instructions
- https://gitlab.zendesk.com/agent/tickets/237287
  - Yet another ticket where the unnecessarily complicated steps for a single-node upgrade resulted in an unusable state and confused users, and all it took to resolve was a full restart
- https://gitlab.zendesk.com/agent/tickets/189923
  - Like the ticket above, the verbose upgrade steps ended up needing further commands to complete, but it wasn't clear that this was the cause
An observable theme in all these related tickets:
- Administrators keep GitLab running for users, and users experience functionality errors, which generates more internal reports for the administrators to deal with
- Support engineers evaluate the ticket from scratch, because it isn't immediately clear whether the upgraded version introduced a new bug that they need to understand or find reports of, or whether the upgrade instructions followed were at fault
Tangentially related to feature request #356636
Author's checklist
- Optional. Consider taking the GitLab Technical Writing Fundamentals course.
- Follow the:
- If you're adding or changing the main heading of the page (H1), ensure that the product tier badge is added.
- If you are a GitLab team member, request a review based on:
  - The documentation page's metadata.
  - The associated Technical Writer.
If you are a GitLab team member and only adding documentation, do not add any of the following labels:
~"frontend"
~"backend"
~"type::bug"
~"database"
These labels cause the MR to be added to code verification QA issues.
Reviewer's checklist
Documentation-related MRs should be reviewed by a Technical Writer for a non-blocking review, based on Documentation Guidelines and the Style Guide.
- If the content requires it, ensure the information is reviewed by a subject matter expert.
- Technical writer review items:
  - Ensure docs metadata is present and up-to-date.
  - Ensure the appropriate labels are added to this MR.
  - Ensure a release milestone is set.
  - If relevant to this MR, ensure content topic type principles are in use, including:
    - The headings should be something you'd do a Google search for. Instead of "Default behavior", say something like "Default behavior when you close an issue".
    - The headings (other than the page title) should be active. Instead of "Configuring GDK", say something like "Configure GDK".
    - Any task steps should be written as a numbered list.
    - If the content still needs to be edited for topic types, you can create a follow-up issue with the docs-technical-debt label.
- Review by assigned maintainer, who can always request/require the reviews above. Maintainer's review can occur before or after a technical writer review.