Skip to content

Org Mover: Order tables by dependencies

Michael Kozono requested to merge gitlab-order-tables-by-dependencies into master

What does this MR do and why?

Continuing a community contribution, and making minor fixes on top of it !156333 (closed).

Here is the original description:

Issue: https://gitlab.com/gitlab-org/gitlab/-/issues/458759#note_1945319018

Overview

This script is only used for research in https://gitlab.com/gitlab-org/gitlab/-/issues/458759, to explore the shape of data in any GitLab PG DB.

This MR introduces several fixes and improvements to the script for generating COPY commands for PostgreSQL data export. The main changes include:

  • Correct retrieval of table dependencies for proper ordering of COPY commands.
  • Enhanced logic for filtering tables based on specified columns.
  • Introduction of a --debug flag for easier debugging and diagnostics.

Why This is Needed

Previously, the script incorrectly handled table dependencies, leading to errors when executing COPY commands. These changes ensure the correct order of execution, which is critical for exporting related data accurately.

How It Works

  1. The script retrieves table dependencies using an SQL query to PostgreSQL system tables (get_table_dependencies).
  2. Dependencies are grouped and sorted to ensure the correct order of COPY command execution (order_tables_by_dependencies).
  3. Еhe --debug flag has been expanded, enabling debugging mode for detailed output of executed commands and retrieved data.

Example:

Initial table names: ["activity_pub_releases_subscriptions", "agent_activity_events", "agent_project_authorizations", "agent_user_access_project_authorizations", "ai_agent_version_attachments", "ai_agent_versions", "ai_agents", "ai_vectorizable_files", "alert_management_alerts", "alert_management_http_integrations", "analytics_cycle_analytics_issue_stage_events", "analytics_cycle_analytics_merge_request_stage_events", "analytics_language_trend_repository_languages", "approval_merge_request_rule_sources", "approval_merge_request_rules", "approval_project_rules", "approval_project_rules_users", "badges", "board_project_recent_visits", "boards", "bulk_import_exports", "catalog_resource_components", "catalog_resource_versions", "catalog_resources", "ci_build_report_results", "ci_builds", "ci_builds_metadata", "ci_daily_build_group_report_results", "ci_freeze_periods", "ci_job_artifacts", "ci_pipeline_artifacts", "ci_pipeline_metadata", "ci_pipeline_schedules", "ci_pipelines", "ci_project_monthly_usages", "ci_refs", "ci_resource_groups", "ci_runner_projects", "ci_running_builds", "ci_secure_files", "ci_sources_pipelines", "ci_stages", "ci_triggers", "ci_unit_tests", "ci_variables", "cluster_agent_tokens", "cluster_agents", "cluster_projects", "clusters_kubernetes_namespaces", "container_expiration_policies", "container_registry_data_repair_details", "container_registry_protection_rules", "container_repositories", "coverage_fuzzing_corpuses", "csv_issue_imports", "custom_software_licenses", "dast_pre_scan_verifications", "dast_profile_schedules", "dast_profiles", "dast_profiles_tags", "dast_scanner_profiles", "dast_site_profile_secret_variables", "dast_site_profiles", "dast_site_tokens", "dast_site_validations", "dast_sites", "dependency_proxy_packages_settings", "deploy_keys_projects", "deployment_approvals", "deployments", "dora_configurations", "dora_daily_metrics", "dora_performance_scores", "draft_notes", "elasticsearch_indexed_projects", "environments", "error_tracking_client_keys", "error_tracking_errors", "events", "external_approval_rules", "external_pull_requests", "external_status_checks", "fork_network_members", "gpg_signatures", "grafana_integrations", "import_export_uploads", "import_failures", "incident_management_escalation_policies", "incident_management_oncall_schedules", "incident_management_timeline_event_tags", "incident_management_timeline_events", "index_statuses", "integrations", "jira_imports", "label_priorities", "labels", "lfs_file_locks", "lfs_objects_projects", "merge_request_assignees", "merge_request_assignment_events", "merge_request_blocks", "merge_request_diffs", "merge_request_requested_changes", "metrics_users_starred_dashboards", "milestones", "ml_candidates", "ml_experiment_metadata", "ml_experiments", "ml_model_version_metadata", "ml_model_versions", "ml_models", "operations_feature_flags", "operations_feature_flags_clients", "operations_feature_flags_issues", "operations_strategies", "operations_strategies_user_lists", "operations_user_lists", "p_batched_git_ref_updates_deletions", "p_catalog_resource_component_usages", "p_catalog_resource_sync_events", "p_ci_build_names", "p_ci_build_sources", "p_ci_builds", "p_ci_builds_execution_configs", "p_ci_builds_metadata", "p_ci_job_artifacts", "p_ci_stages", "packages_build_infos", "packages_cleanup_policies", "packages_debian_project_distribution_keys", "packages_debian_project_distributions", "packages_debian_publications", "packages_npm_metadata_caches", "packages_packages", "packages_protection_rules", "packages_rpm_repository_files", "packages_tags", "packages_terraform_module_metadata", "pages_deployments", "pages_domains", "path_locks", "project_access_tokens", "project_alerting_settings", "project_aliases", "project_audit_events", "project_authorizations", "project_auto_devops", "project_build_artifacts_size_refreshes", "project_ci_cd_settings", "project_ci_feature_usages", "project_compliance_framework_settings", "project_custom_attributes", "project_daily_statistics", "project_deploy_tokens", "project_error_tracking_settings", "project_export_jobs", "project_feature_usages", "project_features", "project_group_links", "project_import_data", "project_incident_management_settings", "project_metrics_settings", "project_mirror_data", "project_pages_metadata", "project_relation_exports", "project_repositories", "project_repository_storage_moves", "project_saved_replies", "project_security_settings", "project_settings", "project_states", "project_topics", "project_wiki_repositories", "projects_sync_events", "prometheus_alert_events", "prometheus_alerts", "prometheus_metrics", "protected_environments", "protected_tags", "relation_import_trackers", "release_links", "releases", "remote_development_agent_configs", "remote_mirrors", "repository_languages", "requirements", "reviews", "sbom_occurrences", "sbom_occurrences_vulnerabilities", "scan_result_policies", "scan_result_policy_violations", "security_policy_project_links", "security_scans", "security_trainings", "self_managed_prometheus_alert_events", "sent_notifications", "service_desk_custom_email_credentials", "service_desk_custom_email_verifications", "service_desk_settings", "snippets", "software_license_policies", "ssh_signatures", "status_check_responses", "status_page_settings", "subscriptions", "target_branch_rules", "terraform_state_versions", "terraform_states", "timelogs", "todos", "trending_projects", "user_project_callouts", "users_ops_dashboard_projects", "users_security_dashboard_projects", "users_star_projects", "vulnerabilities", "vulnerability_feedback", "vulnerability_historical_statistics", "vulnerability_identifiers", "vulnerability_merge_request_links", "vulnerability_occurrences", "vulnerability_remediations", "vulnerability_scanners", "vulnerability_state_transitions", "vulnerability_statistics", "vulnerability_user_mentions", "web_hooks", "wiki_page_meta", "wiki_page_slugs", "wiki_repository_states", "workspace_variables", "workspaces", "x509_commit_signatures", "xray_reports", "zoekt_repositories", "zoom_meetings"]
Ordered table names: ["external_pull_requests", "ci_pipeline_schedules", "ci_refs", "ci_pipelines", "ci_stages", "p_ci_stages", "ci_resource_groups", "p_ci_builds_execution_configs", "ci_builds", "p_ci_builds", "cluster_agents", "cluster_agent_tokens", "merge_request_diffs", "milestones", "agent_activity_events", "ci_sources_pipelines", "vulnerability_scanners", "vulnerability_identifiers", "vulnerability_occurrences", "vulnerabilities", "ci_running_builds", "reviews", "incident_management_timeline_events", "ml_models", "ml_experiments", "packages_packages", "ml_model_versions", "ml_candidates", "sbom_occurrences", "timelogs", "todos", "vulnerability_feedback", "environments", "prometheus_metrics", "prometheus_alerts", "alert_management_alerts", "scan_result_policies", "approval_merge_request_rules", "ci_build_report_results", "cluster_projects", "clusters_kubernetes_namespaces", "deployments", "protected_environments", "deployment_approvals", "p_ci_build_names", "p_ci_build_sources", "p_ci_builds_metadata", "p_ci_job_artifacts", "custom_software_licenses", "software_license_policies", "workspaces", "ai_agents", "ai_agent_versions", "ai_vectorizable_files", "ai_agent_version_attachments", "approval_project_rules", "approval_merge_request_rule_sources", "approval_project_rules_users", "boards", "board_project_recent_visits", "releases", "catalog_resources", "catalog_resource_versions", "catalog_resource_components", "ci_job_artifacts", "coverage_fuzzing_corpuses", "dast_site_tokens", "dast_site_validations", "dast_sites", "dast_site_profiles", "dast_scanner_profiles", "dast_profiles", "dast_profile_schedules", "draft_notes", "fork_network_members", "gpg_signatures", "integrations", "labels", "jira_imports", "merge_request_assignees", "merge_request_assignment_events", "merge_request_blocks", "operations_feature_flags", "operations_feature_flags_issues", "operations_strategies", "operations_user_lists", "operations_strategies_user_lists", "p_catalog_resource_component_usages", "packages_debian_project_distributions", "packages_debian_publications", "scan_result_policy_violations", "ssh_signatures", "external_status_checks", "status_check_responses", "terraform_states", "terraform_state_versions", "vulnerability_merge_request_links", "vulnerability_state_transitions", "vulnerability_user_mentions", "web_hooks", "agent_project_authorizations", "agent_user_access_project_authorizations", "analytics_language_trend_repository_languages", "badges", "bulk_import_exports", "csv_issue_imports", "dast_pre_scan_verifications", "dast_profiles_tags", "dast_site_profile_secret_variables", "dora_daily_metrics", "events", "import_export_uploads", "import_failures", "label_priorities", "lfs_file_locks", "lfs_objects_projects", "metrics_users_starred_dashboards", "ml_experiment_metadata", "ml_model_version_metadata", "packages_build_infos", "packages_debian_project_distribution_keys", "packages_tags", "packages_terraform_module_metadata", "path_locks", "project_access_tokens", "project_compliance_framework_settings", "project_deploy_tokens", "project_export_jobs", "project_group_links", "pages_deployments", "project_pages_metadata", "project_relation_exports", "project_repositories", "project_settings", "project_topics", "prometheus_alert_events", "release_links", "remote_development_agent_configs", "requirements", "sbom_occurrences_vulnerabilities", "security_policy_project_links", "security_trainings", "self_managed_prometheus_alert_events", "service_desk_custom_email_verifications", "service_desk_settings", "user_project_callouts", "users_ops_dashboard_projects", "users_security_dashboard_projects", "wiki_page_meta", "wiki_page_slugs", "project_wiki_repositories", "wiki_repository_states", "workspace_variables", "x509_commit_signatures", "zoekt_repositories", "zoom_meetings", "activity_pub_releases_subscriptions", "alert_management_http_integrations", "ci_builds_metadata", "ci_daily_build_group_report_results", "ci_pipeline_artifacts", "ci_pipeline_metadata", "container_expiration_policies", "container_registry_data_repair_details", "container_registry_protection_rules", "container_repositories", "dependency_proxy_packages_settings", "deploy_keys_projects", "dora_configurations", "dora_performance_scores", "elasticsearch_indexed_projects", "error_tracking_client_keys", "error_tracking_errors", "external_approval_rules", "grafana_integrations", "incident_management_escalation_policies", "incident_management_oncall_schedules", "incident_management_timeline_event_tags", "index_statuses", "operations_feature_flags_clients", "packages_cleanup_policies", "packages_npm_metadata_caches", "packages_protection_rules", "packages_rpm_repository_files", "pages_domains", "project_alerting_settings", "project_aliases", "project_authorizations", "project_auto_devops", "project_build_artifacts_size_refreshes", "project_ci_cd_settings", "project_ci_feature_usages", "project_custom_attributes", "project_daily_statistics", "project_error_tracking_settings", "project_feature_usages", "project_features", "project_import_data", "project_incident_management_settings", "project_metrics_settings", "project_mirror_data", "project_repository_storage_moves", "project_saved_replies", "project_security_settings", "project_states", "projects_sync_events", "protected_tags", "relation_import_trackers", "remote_mirrors", "repository_languages", "security_scans", "sent_notifications", "service_desk_custom_email_credentials", "snippets", "status_page_settings", "subscriptions", "target_branch_rules", "trending_projects", "users_star_projects", "vulnerability_historical_statistics", "vulnerability_remediations", "vulnerability_statistics", "xray_reports", "analytics_cycle_analytics_issue_stage_events", "analytics_cycle_analytics_merge_request_stage_events", "ci_freeze_periods", "ci_project_monthly_usages", "ci_runner_projects", "ci_secure_files", "ci_triggers", "ci_unit_tests", "ci_variables", "merge_request_requested_changes", "p_batched_git_ref_updates_deletions", "p_catalog_resource_sync_events", "project_audit_events"]

Here you can see that the "Ordered table names" contain the same tables but in a different order, according to the dependencies on foreign keys.

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Merge request reports

Loading