Update Export workers to read from replica

What does this MR do?

This MR updates the project and group export workers to read from a database replica when performing an export.

A few notes on the implementation approach:

  1. The data_consistency :delayed option is available (https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/concerns/worker_attributes.rb#L84), but it only works with workers that are retriable, which project export workers are not. They were made non-retriable in !35344 (merged) because data showed that export worker retries are wasteful and generally make no difference to the end result. I tried reintroducing retries to the export worker and using the delayed approach, but reads were still going to the primary: #338326 (comment 650801991)
  2. data_consistency :sticky is not an option either: the job starts out reading from a replica, but once it switches to the primary (on the first write) it does not switch back to a replica for the reads that follow
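
To illustrate the second point, here is a minimal toy model of a sticky session. It is only a sketch: `ToyStickySession` and its methods are hypothetical names, not GitLab's load-balancing classes, and it assumes that the switch to the primary is triggered by a write.

```ruby
# Toy model of a "sticky" session (hypothetical, not GitLab's implementation).
class ToyStickySession
  attr_reader :log

  def initialize
    @stuck_to_primary = false
    @log = []
  end

  # Reads go to a replica until the session has stuck to the primary.
  def read(query)
    host = @stuck_to_primary ? :primary : :replica
    @log << [host, query]
    host
  end

  # A write always goes to the primary and pins every later read there.
  def write(query)
    @stuck_to_primary = true
    @log << [:primary, query]
    :primary
  end
end

session = ToyStickySession.new
session.read("SELECT project")         # served by a replica
session.write("UPDATE export status")  # served by the primary, session is now stuck
1000.times { |i| session.read("SELECT relation batch #{i}") }

# Count how many queries each host served.
puts session.log.map(&:first).tally.inspect
```

In this model a single early write is enough to send the remaining 1000 reads to the primary, which is the opposite of what a long, read-heavy export needs.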

This change should be applicable to:

  • ProjectExportWorker
  • GroupExportWorker
  • RelationExportWorker

since they all use the streaming serializer.
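
The idea of covering all three workers through the shared serializer can be sketched as follows. This is a self-contained illustration under stated assumptions: `ToySession`, `with_replica_reads`, and `ToyStreamingSerializer` are hypothetical names invented for the example and do not reflect the actual classes or API touched by this MR.

```ruby
# Hypothetical session with a block-scoped "read from replica" mode.
class ToySession
  def initialize
    @performed_write = false
    @force_replica_reads = false
  end

  # Route reads to a replica for the duration of the block,
  # then restore whatever mode was active before.
  def with_replica_reads
    previous = @force_replica_reads
    @force_replica_reads = true
    yield
  ensure
    @force_replica_reads = previous
  end

  def host_for_read
    return :replica if @force_replica_reads

    @performed_write ? :primary : :replica
  end

  # Record a write; outside the block, later reads stick to the primary.
  def write!
    @performed_write = true
    :primary
  end
end

# Hypothetical serializer shared by several export workers.
class ToyStreamingSerializer
  def initialize(session, relations)
    @session = session
    @relations = relations
  end

  # Every relation is read inside the replica block, whichever worker calls this.
  def execute
    @session.with_replica_reads do
      @relations.map { |relation| [relation, @session.host_for_read] }
    end
  end
end

session = ToySession.new
session.write! # e.g. the worker updates export state before serializing
ToyStreamingSerializer.new(session, %w[issues labels]).execute.each do |relation, host|
  puts "#{relation}: #{host}"
end
```

Because the block lives in the shared serializer rather than in each worker, any worker that serializes through it gets replica reads, even after an earlier write in the same job.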

Example of an export hitting the replica with the proposed change:

{
  "severity": "INFO",
  "time": "2021-08-13T14:52:21.273Z",
  "class": "ProjectExportWorker",
  "args": [
    "1",
    "52"
  ],
  ...
  "meta.feature_category": "importers",
  "correlation_id": "f4ae54ac9744a72e390a38a1a2a22abf",
  "idempotency_key": "resque:gitlab:duplicate:project_export:46746c03a500f8babbf198679902292e0e870bd8fc76cd1ce92246ddf0c66d56",
  "worker_data_consistency": "always",
  "enqueued_at": "2021-08-13T14:50:19.263Z",
  "job_size_bytes": 6,
  "pid": 40616,
  "message": "ProjectExportWorker JID-8aa75e94765f61aad3f4e0e1: done: 122.008985 sec",
  "job_status": "done",
  ...
  "db_count": 4992,
  "db_write_count": 17,
  "db_cached_count": 401,
  "db_replica_count": 4956,
  "db_replica_cached_count": 401,
  "db_replica_wal_count": 0,
  "db_replica_wal_cached_count": 0,
  "db_primary_count": 36,
  "db_primary_cached_count": 0,
  "db_primary_wal_count": 0,
  "db_primary_wal_cached_count": 0,
  "db_replica_duration_s": 7.822,
  "db_primary_duration_s": 0.061,
  "cpu_s": 81.000855,
  "duration_s": 122.008985,
  "completed_at": "2021-08-13T14:52:21.273Z",
  "load_balancing_strategy": "primary",
  "db_duration_s": 4.462793
}
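
As a quick sanity check on the counters in this log entry, the short script below computes the share of queries served by the replica. Only the four counters it needs are copied from the entry; `replica_share` is a helper name made up for this example.

```ruby
require "json"

# Counters copied from the log entry above; all other fields are omitted.
entry = JSON.parse(<<~LOG)
  {
    "db_count": 4992,
    "db_write_count": 17,
    "db_replica_count": 4956,
    "db_primary_count": 36
  }
LOG

# Percentage of all queries that were served by a replica.
def replica_share(entry)
  (entry.fetch("db_replica_count") * 100.0 / entry.fetch("db_count")).round(2)
end

total_matches = entry["db_replica_count"] + entry["db_primary_count"] == entry["db_count"]

puts "replica share: #{replica_share(entry)}%"
puts "replica + primary == total: #{total_matches}"
```

For this entry the replica served 4956 of 4992 queries (about 99.28%), and the replica and primary counts add up to the total.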

Mentions #338326 (closed)

Edited by George Koltsov