Batched background migrations - Add steal method
Context: Gitlab Customers are reporting that they cannot upgrade their Gitlab instances because, during the upgrade, they receive an error message raised by a rails migration.
Error message example:
== 20220318111040 AddIndexesForPrimaryEmailSecondCleanupMigration: migrating ==
-- indexes(:users)
-> 0.0213s
-- current_schema()
-> 0.0006s
-- execute("SET statement_timeout TO 0")
-> 0.0006s
-- execute("CREATE INDEX CONCURRENTLY index_users_on_id_for_primary_email_migration\nON users (id) INCLUDE (email, confirmed_at)\nWHERE confirmed_at IS NOT NULL\n")
-> 0.0061s
-- execute("RESET statement_timeout")
-> 0.0007srake aborted!
StandardError: An error has occurred, this and all later migrations canceled:
Expected batched background migration for the given configuration to be marked as 'finished', but it is 'active': {:job_class_name=>"ProjectNamespaces::BackfillProjectNamespaces", :table_name=>:projects, :column_name=>:id, :job_arguments=>[nil, "up"]}
Finalize it manualy by running
sudo gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,'[null\,"up"]']
For more information, check the documentation
https://docs.gitlab.com/ee/user/admin_area/monitoring/background_migrations.html#database-migrations-failing-because-of-batched-background-migration-not-finished
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers.rb:960:in `ensure_batched_background_migration_is_finished'
/opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20220322071127_finalize_project_namespaces_backfill.rb:10:in `up'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:112:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:172:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:110:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:118:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:70:in `transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database.rb:305:in `block in transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database.rb:304:in `transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migrations/lock_retry_mixin.rb:36:in `ddl_transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:115:in `configure_database'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:95:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `<main>'
Caused by:
Expected batched background migration for the given configuration to be marked as 'finished', but it is 'active': {:job_class_name=>"ProjectNamespaces::BackfillProjectNamespaces", :table_name=>:projects, :column_name=>:id, :job_arguments=>[nil, "up"]}
Finalize it manualy by running
sudo gitlab-rake gitlab:background_migrations:finalize[ProjectNamespaces::BackfillProjectNamespaces,projects,id,'[null\,"up"]']
For more information, check the documentation
https://docs.gitlab.com/ee/user/admin_area/monitoring/background_migrations.html#database-migrations-failing-because-of-batched-background-migration-not-finished
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers.rb:960:in `ensure_batched_background_migration_is_finished'
/opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20220322071127_finalize_project_namespaces_backfill.rb:10:in `up'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:112:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:172:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:110:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:118:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:70:in `transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database.rb:305:in `block in transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database.rb:304:in `transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migrations/lock_retry_mixin.rb:36:in `ddl_transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:115:in `configure_database'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:95:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `<main>'
Tasks: TOP => db:migrate
(See full trace by running task with --trace)
Flow Example:
14.4
-> customer is here
14.5
-> batched background migration A
14.7
-> ensure_batched_background_migration_is_finished
migration
14.8
-> customers wants to jump to this version
In the example above, the user wants to jump from 14.4
to 14.8
, but when the customer runs the upgrade, he gets the error Expected batched background migration for the given configuration to be marked as 'finished'
. This happens because the user can only jump to version 14.8
when the batched background migration A is finished.
Real cases:
- omnibus-gitlab#6795 (closed)
- !83918 (comment 912501902)
- https://gitlab.slack.com/archives/CNZ8E900G/p1650883372129689
Solution:
We add a new helper method, or we change the ensure_batched_background_migration_is_finished
method to execute the batched background migration inline if the status is not finished.
We also need to make sure that the user receives feedback about the progress of the batched background migration. If the upgrade is canceled due to a timeout, the user needs to receive proper feedback about the next steps.