Create Issues for Groups to clean up duplicate indexes
As a follow-up to #385701 (closed).
We have identified many duplicate btree indexes that can be removed from our database, as they are not needed anymore. They can be found here: https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/support/helpers/database/duplicate_indexes.yml
By running a Ruby script (to be attached separately), I have generated a mapping from groups to the tables that contain these duplicate indexes:
---
anti-abuse:
- abuse_reports
respond:
- alert_management_http_integrations
- incident_management_oncall_participants
- incident_management_oncall_schedules
optimize:
- analytics_cycle_analytics_group_stages
- merge_request_metrics
source_code:
- approval_project_rules_users
- approvals
- project_repositories
- protected_tags
project_management:
- board_group_recent_visits
- board_project_recent_visits
- board_user_preferences
- issue_links
- issues
- list_user_preferences
- sprints
- todos
- work_item_hierarchy_restrictions
product_planning:
- boards_epic_board_recent_visits
- boards_epic_user_preferences
- design_management_designs
- design_management_designs_versions
- related_epic_links
- requirements_management_test_reports
import_and_integrate:
- bulk_import_batch_trackers
- bulk_import_export_batches
- jira_connect_subscriptions
- project_relation_exports
- web_hook_logs
- web_hooks
pipeline_security:
- ci_job_artifacts
- ci_pipeline_artifacts
pipeline_execution:
- ci_stages
- taggings
dynamic_analysis:
- dast_site_tokens
observability:
- error_tracking_errors
geo:
- geo_node_namespace_links
activation:
- in_product_marketing_emails
- member_tasks
compliance:
- instance_audit_events_streaming_headers
- project_compliance_standards_adherence
tenant_scale:
- members
- project_topics
- projects
- users
code_review:
- merge_request_assignees
- merge_requests
mlops:
- ml_candidate_params
- ml_candidates
- ml_model_versions
- ml_models
runner:
- p_ci_runner_machine_builds
package_registry:
- packages_debian_group_distributions
- packages_debian_project_distributions
- packages_tags
knowledge:
- pages_domains
authentication_and_authorization:
- personal_access_tokens
- term_agreements
composition_analysis:
- pm_affected_packages
- pm_package_version_licenses
- pm_package_versions
environments:
- protected_environments
threat_insights:
- sbom_component_versions
- sbom_occurrences
- vulnerabilities
- vulnerability_external_issue_links
- vulnerability_finding_links
- vulnerability_finding_signatures
- vulnerability_flags
security_policies:
- scan_result_policies
global_search:
- search_namespace_index_assignments
foundations:
- user_callouts
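The script that produced this mapping is not attached here. As a minimal sketch of how such a mapping could be built, the snippet below groups table names by owning group. It assumes that `duplicate_indexes.yml` is keyed by table name, and it uses a hypothetical `TABLE_OWNERS` lookup and made-up index names; the real script may derive ownership differently.

```ruby
require 'yaml'

# Assumed layout of duplicate_indexes.yml (not confirmed here):
#   table name => { index => [indexes that duplicate it] }
# The index names below are made up for illustration.
DUPLICATE_INDEXES = YAML.safe_load(<<~YAML)
  approvals:
    index_example_a:
    - index_example_b
  abuse_reports:
    index_example_c:
    - index_example_d
  protected_tags:
    index_example_e:
    - index_example_f
YAML

# Hypothetical table => owning group lookup. How ownership is really
# resolved is an assumption of this sketch.
TABLE_OWNERS = {
  'abuse_reports' => 'anti-abuse',
  'approvals' => 'source_code',
  'protected_tags' => 'source_code'
}.freeze

# Groups the tables that have duplicate indexes by their owning group.
# Tables are sorted first, so groups appear in order of their first table.
def tables_by_group(duplicate_indexes, table_owners)
  duplicate_indexes.keys.sort
    .group_by { |table| table_owners.fetch(table, 'unknown') }
end

puts tables_by_group(DUPLICATE_INDEXES, TABLE_OWNERS).to_yaml
```

Tables without a known owner end up under `unknown`, which makes gaps in the lookup easy to spot before any issues are created.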
As part of this issue, we need to:
- Agree on a template to send to each group so that they can clean up the duplicate indexes. Open question: do we create one issue or several per group when a group owns more than one table? I suggest one issue per table. We can refer to the duplicate indexes in https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/support/helpers/database/duplicate_indexes.yml instead of adding the index names to the issues manually.
- Create the issues and send them to the teams. We can use the Ruby script that generated the mapping to create the issue texts. We could even extend it to create the issues via the API, but this might take more time.
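Generating the issue texts could look roughly like the sketch below. It fills in the group and index lines of the suggested template; the method name `issue_for` and the index names are hypothetical, and creating the issues via the API is left out.

```ruby
# Builds the title and the group-specific part of the description for one group.
# `indexes` is a list of duplicate index names owned by that group.
def issue_for(group, indexes)
  title = "Remove #{group} duplicated indexes"
  description = [
    "We have identified the following duplicated indexes owned by #{group}:",
    *indexes.sort.map { |name| "- #{name}" }
  ].join("\n")

  { title: title, description: description }
end

# Hypothetical index names, for illustration only.
issue = issue_for('source_code', %w[index_example_b index_example_a])
puts issue[:title]
puts issue[:description]
```

Returning a plain hash keeps the text generation separate from however the issues end up being created, whether by hand or by a later script.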
Suggested template (work in progress). Feel free to suggest changes at any time.
TITLE: Remove <group> duplicated indexes
After https://gitlab.com/gitlab-org/gitlab/-/issues/385701, we can identify duplicated indexes. At the moment, there are many duplicate indexes that can be removed from our database, as they are not needed anymore.
The list of duplicated indexes can be found in the [`duplicate_indexes.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/support/helpers/database/duplicate_indexes.yml) file.
We have identified the following duplicated indexes owned by <group>:
- <index name>
Feel free to split this issue into several under https://gitlab.com/groups/gitlab-org/-/epics/11436 if that aligns more with how your group works.
See the references below to remove indexes. If the table is large or receives a lot of traffic, please consider dropping the index asynchronously.
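As a starting point, a synchronous removal migration might look like the sketch below. The table, column, and index names are hypothetical, and the small stand-in class at the top exists only so that the snippet loads outside the GitLab codebase; inside GitLab, the real `Gitlab::Database::Migration` base class and helpers are used instead. Please check the style guide linked below for the current conventions.

```ruby
# Stand-in for GitLab's migration base class so this sketch loads outside the
# GitLab codebase. It only records helper calls and is skipped inside GitLab.
unless defined?(Gitlab::Database::Migration)
  module Gitlab
    module Database
      class Migration
        def self.[](_version)
          self
        end

        def self.disable_ddl_transaction!; end

        def calls
          @calls ||= []
        end

        def remove_concurrent_index_by_name(table, name)
          calls << [:remove_concurrent_index_by_name, table, name]
        end

        def add_concurrent_index(table, columns, name:)
          calls << [:add_concurrent_index, table, columns, name]
        end
      end
    end
  end
end

class RemoveDuplicateIndexFromAbuseReports < Gitlab::Database::Migration[2.1]
  # Concurrent index operations cannot run inside a transaction.
  disable_ddl_transaction!

  # Hypothetical index name; take the real one from duplicate_indexes.yml.
  INDEX_NAME = 'index_abuse_reports_on_example_column'

  def up
    remove_concurrent_index_by_name :abuse_reports, INDEX_NAME
  end

  # Recreates the index so that the migration stays reversible.
  def down
    add_concurrent_index :abuse_reports, :example_column, name: INDEX_NAME
  end
end
```

The `down` method has to list the same columns as the index being removed, so it is worth copying its definition from `db/structure.sql` before dropping it.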
<ADD MORE EXPLANATION AND CODE EXAMPLES>
References:
- https://docs.gitlab.com/ee/development/database/adding_database_indexes.html#drop-indexes-asynchronously
- https://docs.gitlab.com/ee/development/migration_style_guide.html#removing-indexes
cc <engineering manager> <product manager>
CC: @alexives
Edited by Arturo Herrero