
Do not requeue the indexing worker if failures occur

Background

Related to #413524 (closed)

Currently the bulk cron worker re-enqueues itself whenever records remain in the queue, instead of waiting for the next scheduled run. The worker runs on a 1-minute schedule. This poses a problem when many records fail to index: the worker keeps re-enqueueing itself and retrying the same failing records in a tight loop.
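The problematic behavior described above can be sketched as follows. This is a hedged simulation with hypothetical names (`CurrentBulkCronWorkerSketch`, `run`, `process_batch`), not the actual GitLab worker code:

```ruby
# Hedged sketch of the current (pre-MR) requeue behavior; class and
# method names are illustrative, not the real worker implementation.
class CurrentBulkCronWorkerSketch
  def initialize(queue)
    @queue = queue # simulated bookkeeping queue of record references
  end

  # Returns how many times the worker re-enqueued itself in one
  # simulated run. Records that fail to index go back on the queue,
  # so persistent failures produce a tight requeue loop instead of
  # waiting for the 1-minute cron schedule.
  def run(process_batch)
    requeues = 0
    until @queue.empty?
      @queue = process_batch.call(@queue)
      break if @queue.empty?

      requeues += 1            # re-enqueue because records remain
      break if requeues >= 100 # safety cap for the simulation
    end
    requeues
  end
end
```

With a batch processor that always fails, the simulated worker re-enqueues until the cap; with one that drains the queue, it never re-enqueues.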

What does this MR do and why?

This MR introduces a change in the requeue logic:

  1. return the number of records that failed to index to the worker
  2. do not re-queue if any failures happened during indexing
  3. update the specs accordingly
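The new requeue decision can be sketched as follows. This is a minimal illustration with hypothetical names (`BulkCronWorkerSketch`, `perform`, and a service assumed to return a records count and a failures count), not the actual worker or service code:

```ruby
# Hedged sketch of the requeue logic after this change; names are
# illustrative, not the real GitLab classes.
class BulkCronWorkerSketch
  # `service` is assumed to respond to #execute and return
  # [records_processed, failures_count] after this MR's change.
  def perform(service)
    records_count, failures_count = service.execute

    # Before: re-enqueue whenever any records remain in the queue.
    # After: never re-enqueue if any record failed to index; the
    # 1-minute cron schedule handles the retry instead, which avoids
    # the tight requeue loop on persistent failures.
    records_count > 0 && failures_count == 0
  end
end
```

The key design choice is that failures no longer trigger an immediate retry; the scheduled cron run becomes the natural backoff.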

Screenshots or screen recordings

N/A - all work is in background jobs

How to set up and validate locally

  1. set up GDK for Elasticsearch
  2. check out the master branch; it is probably a good idea to restart background jobs: gdk restart rails-background-jobs
  3. introduce a new field in the issue mapping to force indexing failures:
    diff --git a/ee/lib/elastic/latest/issue_instance_proxy.rb b/ee/lib/elastic/latest/issue_instance_proxy.rb
    index 0b054b3acf89..3d694d2b7540 100644
    --- a/ee/lib/elastic/latest/issue_instance_proxy.rb
    +++ b/ee/lib/elastic/latest/issue_instance_proxy.rb
    @@ -28,6 +28,8 @@ def as_indexed_json(options = {})
             data['namespace_ancestry_ids'] = target.namespace_ancestry
             data['label_ids'] = target.label_ids.map(&:to_s)
    
    +        data['i_am_missing'] = 'TEST'
    +
             if ::Elastic::DataMigrationService.migration_has_finished?(:add_hashed_root_namespace_id_to_issues)
               data['hashed_root_namespace_id'] = target.project.namespace.hashed_root_namespace_id
             end
    
  4. reindex everything from scratch: bundle exec rake gitlab:elastic:index
  5. open rails console and start the initial bulk cron worker: ElasticIndexInitialBulkCronWorker.new.perform
  6. Elastic::ProcessInitialBookkeepingService.queue_size should stay non-zero even though the cron worker is running
  7. you should see the ElasticIndexInitialBulkCronWorker continue to re-queue itself in the Sidekiq logs, and the indexing attempts will show up in elasticsearch.log
  8. check out this branch (make sure you still have the new field added in the issue mapping)
  9. restart background jobs: gdk restart rails-background-jobs
  10. reindex everything from scratch: bundle exec rake gitlab:elastic:index
  11. open rails console and start the initial bulk cron worker: ElasticIndexInitialBulkCronWorker.new.perform
  12. you should NOT see the ElasticIndexInitialBulkCronWorker re-queue itself in the Sidekiq logs; it should run only every minute as scheduled.

Note: 16 shards get processed, so you will see a message for each shard with the shard number sent in args, but the message should not repeat for each shard:

{"severity":"INFO","time":"2023-05-30T16:18:05.807Z","retry":0,"queue":"default","backtrace":true,"version":0,"queue_namespace":"cronjob","args":["13"],"class":"ElasticIndexInitialBulkCronWorker

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Terri Chu
