Move gitlab:elastic:projects_not_indexed to finder
What does this MR do and why?
Related to #384039 (closed)
- Move projects_not_indexed code from rake task to a finder (to allow it to be called in other places)
- Update specs for rake task
- Add new specs for finder
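To illustrate the idea behind the move, here is a minimal plain-Ruby sketch of what a "not indexed" finder does: return every project that has no matching index_statuses row. The class name ProjectsNotIndexedFinder, its keyword arguments, and the Struct stand-ins for the ActiveRecord models are assumptions made for illustration; they are not the code added in this MR.

```ruby
require 'set'

# Plain-Ruby stand-ins for the ActiveRecord models (hypothetical, for illustration only).
Project = Struct.new(:id, :full_path)
IndexStatus = Struct.new(:project_id)

# Hypothetical finder: returns the projects that have no index_statuses row.
# Putting the logic in one object lets a rake task or any other caller reuse it.
class ProjectsNotIndexedFinder
  def initialize(projects:, index_statuses:)
    @projects = projects
    @indexed_ids = index_statuses.map(&:project_id).to_set
  end

  # Equivalent in spirit to LEFT JOIN index_statuses ... WHERE project_id IS NULL.
  def execute
    @projects.reject { |project| @indexed_ids.include?(project.id) }
  end
end

projects = [
  Project.new(6, 'jashkenas/Underscore'),
  Project.new(11, 'jlevy/the-art-of-command-line'),
  Project.new(14, 'jtleek/Datasharing')
]
statuses = [IndexStatus.new(11)]

not_indexed = ProjectsNotIndexedFinder.new(projects: projects, index_statuses: statuses).execute
puts not_indexed.map(&:id).inspect # => [6, 14]
```

Because the lookup lives in a single object with an execute method, the rake task only has to format the result.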
Screenshots or screen recordings
N/A
How to set up and validate locally
- Set up GDK for Elasticsearch and index everything
- Run the rake task and confirm that everything is indexed:
  bundle exec rake gitlab:elastic:projects_not_indexed
  All projects are currently indexed
- Remove a few projects from the index_statuses table:
  Project.all.sample(5).each { |project| project.index_status.delete }
- Run the rake task and verify that the 5 projects are listed:
  bundle exec rake gitlab:elastic:projects_not_indexed
  Project 'jashkenas/Underscore' (ID: 6) isn't indexed.
  Project 'pamula/Flight' (ID: 17) isn't indexed.
  Project 'jtleek/Datasharing' (ID: 14) isn't indexed.
  Project 'jlevy/the-art-of-command-line' (ID: 11) isn't indexed.
  Project 'earleen/Flight' (ID: 23) isn't indexed.
  5 out of 5 non-indexed projects shown.
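The output in the steps above can be reproduced with a small formatting helper. This is only a sketch: the method name not_indexed_report, the hash keys, and the limit parameter (suggested by the "5 out of 5 ... shown" wording, with an arbitrary default) are assumptions, not the rake task's actual implementation.

```ruby
# Hypothetical helper that mirrors the rake task output shown above.
# `projects` is an array of hashes with :id and :full_path keys (an assumption).
# `limit` caps how many projects are listed; its default of 500 is arbitrary.
def not_indexed_report(projects, limit: 500)
  return ['All projects are currently indexed'] if projects.empty?

  shown = projects.first(limit)
  lines = shown.map do |project|
    "Project '#{project[:full_path]}' (ID: #{project[:id]}) isn't indexed."
  end
  lines << "#{shown.size} out of #{projects.size} non-indexed projects shown."
end

sample = [
  { id: 6, full_path: 'jashkenas/Underscore' },
  { id: 17, full_path: 'pamula/Flight' }
]
puts not_indexed_report(sample)
```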
Database
Note that this finder cannot be run on GitLab.com: the query takes too long and times out.
The database-lab timings below were gathered using Project.all rather than ::Gitlab::CurrentSettings.elasticsearch_limited_projects.
postgres.ai timings - no index limiting enabled
Project.all.not_indexed_in_elasticsearch.each_batch do |batch|
https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/21224/commands/69204
SELECT
"projects"."id"
FROM
"projects"
LEFT JOIN index_statuses ON projects.id = index_statuses.project_id
WHERE
"index_statuses"."project_id" IS NULL
ORDER BY
"projects"."id" ASC
LIMIT 1
https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/21224/commands/69205
SELECT
"projects"."id"
FROM
"projects"
LEFT JOIN index_statuses ON projects.id = index_statuses.project_id
WHERE
"index_statuses"."project_id" IS NULL
AND "projects"."id" >= 1
ORDER BY
"projects"."id" ASC
LIMIT 1 OFFSET 1000
https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/21224/commands/69206
SELECT
"projects"."id",
"projects"."name",
"projects"."path",
"projects"."description",
"projects"."created_at",
"projects"."updated_at",
"projects"."creator_id",
"projects"."namespace_id",
"projects"."last_activity_at",
"projects"."import_url",
"projects"."visibility_level",
"projects"."archived",
"projects"."avatar",
"projects"."merge_requests_template",
"projects"."star_count",
"projects"."merge_requests_rebase_enabled",
"projects"."import_type",
"projects"."import_source",
"projects"."approvals_before_merge",
"projects"."reset_approvals_on_push",
"projects"."merge_requests_ff_only_enabled",
"projects"."issues_template",
"projects"."mirror",
"projects"."mirror_last_update_at",
"projects"."mirror_last_successful_update_at",
"projects"."mirror_user_id",
"projects"."shared_runners_enabled",
"projects"."runners_token",
"projects"."build_allow_git_fetch",
"projects"."build_timeout",
"projects"."mirror_trigger_builds",
"projects"."pending_delete",
"projects"."public_builds",
"projects"."last_repository_check_failed",
"projects"."last_repository_check_at",
"projects"."only_allow_merge_if_pipeline_succeeds",
"projects"."has_external_issue_tracker",
"projects"."repository_storage",
"projects"."repository_read_only",
"projects"."request_access_enabled",
"projects"."has_external_wiki",
"projects"."ci_config_path",
"projects"."lfs_enabled",
"projects"."description_html",
"projects"."only_allow_merge_if_all_discussions_are_resolved",
"projects"."repository_size_limit",
"projects"."printing_merge_request_link_enabled",
"projects"."auto_cancel_pending_pipelines",
"projects"."service_desk_enabled",
"projects"."cached_markdown_version",
"projects"."delete_error",
"projects"."last_repository_updated_at",
"projects"."disable_overriding_approvers_per_merge_request",
"projects"."storage_version",
"projects"."resolve_outdated_diff_discussions",
"projects"."remote_mirror_available_overridden",
"projects"."only_mirror_protected_branches",
"projects"."pull_mirror_available_overridden",
"projects"."jobs_cache_index",
"projects"."external_authorization_classification_label",
"projects"."mirror_overwrites_diverged_branches",
"projects"."pages_https_only",
"projects"."external_webhook_token",
"projects"."packages_enabled",
"projects"."merge_requests_author_approval",
"projects"."pool_repository_id",
"projects"."runners_token_encrypted",
"projects"."bfg_object_map",
"projects"."detected_repository_languages",
"projects"."merge_requests_disable_committers_approval",
"projects"."require_password_to_approve",
"projects"."max_pages_size",
"projects"."max_artifacts_size",
"projects"."pull_mirror_branch_prefix",
"projects"."remove_source_branch_after_merge",
"projects"."marked_for_deletion_at",
"projects"."marked_for_deletion_by_user_id",
"projects"."autoclose_referenced_issues",
"projects"."suggestion_commit_message",
"projects"."project_namespace_id",
"projects"."hidden"
FROM
"projects"
LEFT JOIN index_statuses ON projects.id = index_statuses.project_id
WHERE
"index_statuses"."project_id" IS NULL
AND "projects"."id" >= 1
AND "projects"."id" < 1000
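The three queries above appear to follow the usual each_batch pattern: find the first id, find the id one batch further along to use as an exclusive upper bound, then load the rows in that id range. The plain-Ruby sketch below simulates that pattern over an array of ids. The method name each_id_batch and its of: parameter are hypothetical, and this is not GitLab's EachBatch implementation.

```ruby
# Simulates the three-step batching pattern over an in-memory list of ids.
# Hypothetical illustration only; the real queries run against PostgreSQL.
def each_id_batch(ids, of: 1000)
  sorted = ids.sort
  start = sorted.first # query 1: ORDER BY id ASC LIMIT 1

  while start
    candidates = sorted.select { |id| id >= start }

    # query 2: WHERE id >= start ORDER BY id ASC LIMIT 1 OFFSET of
    stop = candidates[of]

    # query 3: WHERE id >= start AND id < stop (no upper bound on the last batch)
    batch = stop ? candidates.take_while { |id| id < stop } : candidates
    yield batch

    start = stop
  end
end

batches = []
each_id_batch([1, 3, 5, 7, 9], of: 2) { |batch| batches << batch }
puts batches.inspect # => [[1, 3], [5, 7], [9]]
```

Each iteration costs two queries (the boundary lookup and the row fetch), which is why the boundary query with the LEFT JOIN matters as much for the timings as the final SELECT.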
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Edited by Terri Chu