Geo: Skip blob download if already exists

Michael Kozono requested to merge mk/skip-blob-download-if-exists into master

What does this MR do and why?

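On a Geo secondary site, skip the blob download if the file already exists. The behavior is behind the geo_skip_download_if_exists feature flag, which is disabled by default.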

Resolves #352530 (closed)

How to set up and validate locally

If you don't have a GDK + Geo:

  1. Install GDK + Geo: https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/main/doc/howto/geo.md#easy-installation
  2. The installation seeds plenty of data by default, so there is something to replicate.

Confirm that nothing changes when the feature flag is disabled (which is the default).

  1. In your primary GDK/gitlab directory, switch to this branch: git checkout mk/skip-blob-download-if-exists && gdk restart rails
  2. In your secondary GDK/gitlab directory, switch to this branch: git checkout mk/skip-blob-download-if-exists && gdk restart rails
  3. In your secondary GDK/gitlab directory, tail relevant log output: tail -f log/geo.log | grep "Blob download"
  4. Open a new Terminal tab
  5. In your primary GDK/gitlab directory, tail relevant log output: gdk tail gitlab-workhorse | grep "api/v4/geo/retrieve/upload"
  6. Open a new Terminal tab
  7. In your secondary GDK/gitlab directory, delete the registry records of 3 Uploads: gdk psql-geo -c "DELETE FROM file_registry WHERE id IN (SELECT id FROM file_registry WHERE state = 2 AND verification_state = 2 LIMIT 3);"
  8. After a couple of minutes (or after you run ::Geo::Secondary::RegistryConsistencyWorker.new.perform && Geo::RegistrySyncWorker.perform_async), the secondary site resyncs these Uploads.
  9. The secondary's Geo log will output something like {"severity":"INFO","time":"2023-12-12T01:28:57.699Z","correlation_id":"1e51c2e3bc90c8cef5aad2cebaafecd9","class":"Geo::BlobDownloadService","gitlab_host":"gdk2.test","message":"Blob download","replicable_name":"upload","model_record_id":4,"mark_as_synced":true,"download_success":true,"bytes_downloaded":65,"primary_missing_file":false,"download_time_s":0.052,"reason":null}.
  10. The primary's Workhorse log will output something like 2023-12-12_01:54:49.90730 gitlab-workhorse : {"content_type":"application/octet-stream","correlation_id":"01HHDVQSCEX05TKR90V7SV58X8","duration_ms":36,"host":"gdk.test:3443","level":"info","method":"GET","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"172.16.123.1:64626","remote_ip":"172.16.123.1","route":"^/api/","status":200,"system":"http","time":"2023-12-11T15:54:49-10:00","ttfb_ms":36,"uri":"/api/v4/geo/retrieve/upload/25","user_agent":"http.rb/5.1.1","written_bytes":65}
  11. Browse to, or create, an issue. Add an attachment in a comment or in the description. Observe the logs.
  12. Browse to Admin Area > Geo > Sites > Replication Details > Uploads. Click Resync on a few of them. Observe the logs.
  13. In particular, notice that "skipped":true does not appear in any of the logs, and that the primary site's Workhorse receives a request for each Upload.
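
The log checks in the steps above can also be scripted. The following is a minimal sketch, not part of this MR. It assumes each relevant line of log/geo.log is a JSON object shaped like the sample in step 9, and that a skipped download is marked with "skipped":true. The helper names parse_json_line and summarize_blob_downloads are hypothetical.

```ruby
require 'json'

# Hypothetical helper: parse one log line, returning nil for non-JSON input.
def parse_json_line(line)
  JSON.parse(line)
rescue JSON::ParserError
  nil
end

# Hypothetical helper: tally "Blob download" entries from Geo log lines.
# Assumes one JSON object per line, as in the sample output above.
def summarize_blob_downloads(lines)
  summary = { total: 0, skipped: 0, downloaded: 0 }

  lines.each do |line|
    entry = parse_json_line(line)
    next unless entry.is_a?(Hash) && entry['message'] == 'Blob download'

    summary[:total] += 1
    if entry['skipped'] == true
      summary[:skipped] += 1
    elsif entry['download_success'] == true
      summary[:downloaded] += 1
    end
  end

  summary
end
```

For example, calling summarize_blob_downloads(File.foreach('log/geo.log')) from the secondary GDK/gitlab directory should report a skipped count of 0 while the feature flag is disabled.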

Now enable the feature and perform the above actions again:

  1. In your primary GDK/gitlab directory, open a Rails console and enable the feature flag: Feature.enable(:geo_skip_download_if_exists)
  2. Expect "skipped":true only in the case where registry records were deleted. When the secondary decides to skip the download, the primary site's Workhorse does not receive a request.
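
To compare the primary's Workhorse log between the two runs, a sketch like the following may help. It is an illustration only, not part of this MR. It assumes each line looks like the Workhorse sample above (a timestamp and service prefix followed by a JSON object) and collects the numeric ID at the end of each /api/v4/geo/retrieve/upload/ URI. The name retrieved_upload_uri_ids is hypothetical.

```ruby
require 'json'

# Matches the retrieval URI seen in the Workhorse sample line.
RETRIEVE_UPLOAD_URI = %r{\A/api/v4/geo/retrieve/upload/(\d+)\z}

# Hypothetical helper: return the trailing numeric IDs of all upload
# retrieval requests found in the given Workhorse log lines.
def retrieved_upload_uri_ids(lines)
  lines.filter_map do |line|
    # Skip the "timestamp gitlab-workhorse : " prefix, if any.
    json_start = line.index('{')
    next unless json_start

    entry = begin
      JSON.parse(line[json_start..])
    rescue JSON::ParserError
      nil
    end
    next unless entry.is_a?(Hash)

    match = RETRIEVE_UPLOAD_URI.match(entry['uri'].to_s)
    match && match[1].to_i
  end
end
```

With the feature flag enabled, the uploads whose registry records were deleted should not add new entries to this list, since the secondary skips those downloads.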

You can also test that verification still works:

  1. In the secondary site Rails console, run Geo::UploadRegistry.first and note the file_id. In this example, the file_id happened to be 4.

  2. In the secondary site Rails console, run Geo::UploadRegistry.find_by(file_id: 4).replicator.carrierwave_uploader.file.path. In this example, it output the path to the upload file: => "/Users/mkozonogitlab/Developer/gdk2/gitlab/public/uploads/@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b/6d5eee4fc72fffd83d024115c6866b81/seeded_upload.txt"

  3. Corrupt the file, for example by opening it in a text editor and modifying it.

  4. In the secondary site Rails console, run Geo::UploadRegistry.find_by(file_id: 4).replicator.verify

  5. In the secondary site Rails console, the registry record is now "sync failed" and "verification failed":

    [36] pry(main)> Geo::UploadRegistry.find_by(file_id: 4).reload
      Geo::UploadRegistry Load (0.3ms)  SELECT "file_registry".* FROM "file_registry" WHERE "file_registry"."file_id" = $1 LIMIT $2 /*application:console, db_config_name:geo,console_hostname:MikesGitLabMBP.localdomain,console_username:mkozonogitlab,line:(pry):36:in `__pry__'*/  [["file_id", 4], ["LIMIT", 1]]
      Geo::UploadRegistry Load (0.1ms)  SELECT "file_registry".* FROM "file_registry" WHERE "file_registry"."id" = $1 LIMIT $2 /*application:console, db_config_name:geo,console_hostname:MikesGitLabMBP.localdomain,console_username:mkozonogitlab,line:(pry):36:in `__pry__'*/  [["id", 67], ["LIMIT", 1]]
    => #<Geo::UploadRegistry:0x00000001437bf2c0
    id: 67,
    file_id: 4,
    created_at: Tue, 12 Dec 2023 00:53:31.817629000 UTC +00:00,
    retry_count: 1,
    retry_at: Tue, 12 Dec 2023 01:28:36.114777000 UTC +00:00,
    missing_on_primary: false,
    state: 3,
    last_synced_at: Tue, 12 Dec 2023 00:54:07.591014000 UTC +00:00,
    last_sync_failure: "Verification failed with: Checksum does not match the primary checksum  {:checksum=>\"dcc13385700f84ab63961a0c88d20e1ff79e97493945f0673f5653f45ac93bcc\",  :primary_checksum=>\"85418cc881d37d83c7e681bc43f63731bf0849e06dc59fa8fa2dcf5448a47b8e\"}",
    verified_at: Tue, 12 Dec 2023 01:27:50.114645000 UTC +00:00,
    verification_started_at: Tue, 12 Dec 2023 01:27:50.101930000 UTC +00:00,
    verification_retry_at: Tue, 12 Dec 2023 01:28:14.114560000 UTC +00:00,
    verification_state: 3,
    verification_retry_count: 1,
    verification_checksum: "dcc13385700f84ab63961a0c88d20e1ff79e97493945f0673f5653f45ac93bcc",
    verification_checksum_mismatched: "dcc13385700f84ab63961a0c88d20e1ff79e97493945f0673f5653f45ac93bcc",
    checksum_mismatch: true,
    verification_failure: "Checksum does not match the primary checksum {:checksum=>\"dcc13385700f84ab63961a0c88d20e1ff79e97493945f0673f5653f45ac93bcc\",  :primary_checksum=>\"85418cc881d37d83c7e681bc43f63731bf0849e06dc59fa8fa2dcf5448a47b8e\"}">
  6. In the secondary site Rails console, run Geo::RegistrySyncWorker.perform_async to make the system resync the file.

  7. Open the file in a text editor again. It is fixed!
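
As an alternative to inspecting the file by eye in steps 3 and 7, the file's checksum can be compared against the primary_checksum reported in the failure message. This is a minimal sketch under an assumption: that the checksums are SHA-256 hex digests of the file contents, which the 64-character values in the output above suggest but this MR does not state. The helper name checksum_matches? is hypothetical.

```ruby
require 'digest'

# Hypothetical helper: compare a local file against an expected checksum.
# Assumes the expected value is a SHA-256 hex digest of the file contents.
def checksum_matches?(path, expected_checksum)
  Digest::SHA256.file(path).hexdigest == expected_checksum
end
```

After the resync in step 6, calling checksum_matches? with the path from step 2 and the primary_checksum value from step 5 should return true if the assumption about the digest holds.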

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Michael Kozono