Use a parameter to configure BG migrations

Patrick Bair requested to merge 343047-bg-migration-execution-by-schema into master

What does this MR do and why?

Related to #343047 (closed)

Prepares the BackgroundMigration module to work with multiple workers and databases. We intend to have a worker for each database in a decomposed setup, so we will introduce a new Ci::BackgroundMigrationWorker in a future MR.

Multi-database execution requires a schema value to be passed in, which selects the "context" in which the migration jobs run. The database defaults to :main for now; support for :ci will follow once we add the new CI worker.

Initially, the migration will provide the database when it enqueues the background migration, so that we can select the corresponding worker class for the intended database.
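As a rough illustration of selecting a worker class from a database name, here is a minimal sketch. It is not the code in this MR: the lookup table `WORKER_CLASS_NAMES`, the helper `worker_class_for`, and the empty stand-in worker class are all hypothetical, and only :main is mapped because the CI worker does not exist yet.

```ruby
# Hypothetical stand-in for the real Sidekiq worker, so the sketch runs on its own.
class BackgroundMigrationWorker; end

# Hypothetical lookup table (an assumption for illustration, not GitLab's code).
# A :ci entry would be added once Ci::BackgroundMigrationWorker exists.
WORKER_CLASS_NAMES = {
  main: 'BackgroundMigrationWorker'
}.freeze

# Returns the worker class for the given database, defaulting to :main.
# Raises ArgumentError for a database that has no worker yet.
def worker_class_for(database = :main)
  name = WORKER_CLASS_NAMES.fetch(database) do
    raise ArgumentError, "no background migration worker for database: #{database}"
  end

  Object.const_get(name)
end

worker_class_for          # resolves to BackgroundMigrationWorker
worker_class_for(:main)   # same class, database passed explicitly
```

Failing loudly on an unknown database, rather than falling back to :main, is one possible design choice here; it keeps a job intended for another database from silently running in the wrong context.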

Since we rearranged the work on this, !72916 (merged) is now the MR which updates the BackgroundMigrationWorker to use the new logic in preparation for adding a ci worker, and !73306 (merged) will set up the conditional logic to be able to select the correct worker when enqueueing jobs.

How to set up and validate locally

  1. Set up a test migration job:
    module Gitlab
      module BackgroundMigration
        class MyMigration
          def perform(start_id, stop_id)
            puts "called with: #{start_id},#{stop_id}"
            puts "called from:\n\t#{caller[0...2].join("\n\t")}"
          end
        end
      end
    end
  2. Run the job directly from the worker:
    BackgroundMigrationWorker.new.perform('MyMigration', [1, 10])
  3. Verify it executes with similar output:
    called with: 1,10
    called from:
        /Users/pbair/Projects/gitlab-development-kit/gitlab/lib/gitlab/background_migration/job_coordinator.rb:70:in `perform'
        /Users/pbair/Projects/gitlab-development-kit/gitlab/lib/gitlab/background_migration.rb:39:in `perform'
  4. Schedule a couple of jobs for future execution:
    BackgroundMigrationWorker.perform_in(1.day, 'MyMigration', [20, 30])
    BackgroundMigrationWorker.perform_in(1.day, 'MyFakeMigration', [30, 40])
  5. Verify the two jobs are scheduled:
    s = Sidekiq::ScheduledSet.new
    s.select { |j| %w[MyMigration MyFakeMigration].include? j.args.first }.size # => 2
  6. Steal jobs matching only our test migration:
    Gitlab::BackgroundMigration.steal('MyMigration') { |job| puts job.args; true }
  7. Verify from the output that only the job with the specified name runs:
    MyMigration
    20
    30
    called with: 20,30
    called from:
        /Users/pbair/Projects/gitlab-development-kit/gitlab/lib/gitlab/background_migration/job_coordinator.rb:70:in `perform'
        /Users/pbair/Projects/gitlab-development-kit/gitlab/lib/gitlab/background_migration/job_coordinator.rb:58:in `block (2 levels) in steal'
    => [#<Sidekiq::ScheduledSet:0x00007fc8c19e9d00 @_size=1, @name="schedule">, #<Sidekiq::Queue:0x00007fc8c19e8798 @name="default", @rname="queue:default">]
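The filtering behaviour checked in steps 4 to 7 can be approximated with a small in-memory sketch. This is not the actual `job_coordinator.rb` logic and it does not use Sidekiq: `FakeJob` and `steal_sketch` are hypothetical names, and a plain array stands in for the scheduled set. It only shows the idea that jobs are matched on their first argument, the optional block can veto a job, and stolen jobs are removed from the queue.

```ruby
# In-memory stand-in for a queued job entry (hypothetical, not Sidekiq's API).
FakeJob = Struct.new(:args)

# Removes and "performs" the jobs whose first argument equals class_name.
# If a block is given, a job is only stolen when the block returns a truthy value.
# Returns the argument lists of the jobs that were performed.
def steal_sketch(queue, class_name)
  performed = []

  queue.delete_if do |job|
    migration, migration_args = job.args

    next false unless migration == class_name     # leave other migrations queued
    next false if block_given? && !yield(job)     # the caller's block can veto a job

    performed << migration_args                   # stands in for running the migration
    true                                          # delete the stolen job from the queue
  end

  performed
end

queue = [
  FakeJob.new(['MyMigration', [20, 30]]),
  FakeJob.new(['MyFakeMigration', [30, 40]])
]

performed = steal_sketch(queue, 'MyMigration') { |_job| true }
performed   # => [[20, 30]]
queue.size  # => 1, the MyFakeMigration job is still queued
```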

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Patrick Bair