Geo: Orphaned uploads lead to "Sync timed out after 28800"
Orphaned uploads apparently lead to "Sync timed out after 28800" (28800 seconds is 8 hours). That doesn't seem like an appropriate failure mode.
Example: #417164 (closed)
- It implies that the sync jobs are exiting without updating state (or worse, hanging forever?).
- An orphaned upload is a relatively common occurrence and easy to detect on a secondary site without even having to do a request against the primary site.
- It consumes the concurrency limit for 8 hours.
In this case, the secondary should:
- When attempting to sync a model_record, check if it is lost_orphan? (I added "lost" because it is technically possible for a model_record to know where its data is without its parent... it depends on the implementation of the path).
- Treat it as a failure immediately without attempting a request for the resource.
- Log a descriptive failure message, e.g. "Upload with ID X is orphaned. Model with class Y and ID Z does not exist in the database."
- Set the retry_at to a relatively long time from now, e.g. 24 hours, since it is a data integrity problem that will almost always persist forever in the PG database.
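The steps above could be sketched roughly as follows. This is plain Ruby and not the actual Geo code: Upload, SyncState, sync_upload, ORPHAN_RETRY_DELAY and the parent_exists callable are all hypothetical names made up for illustration, and lost_orphan? is simplified to "the parent row is missing locally", which ignores the case where the path can still be resolved without the parent.

```ruby
require "logger"

# Hypothetical stand-ins for the real records (assumptions, not GitLab classes).
Upload = Struct.new(:id, :model_class, :model_id, keyword_init: true)
SyncState = Struct.new(:status, :last_failure, :retry_at, keyword_init: true)

# "e.g. 24 hours" from the proposal, in seconds.
ORPHAN_RETRY_DELAY = 24 * 60 * 60

# Simplified check: the upload is a lost orphan when its parent model is not
# in the local (secondary) database. parent_exists is any callable taking
# (model_class, model_id) and returning true or false.
def lost_orphan?(upload, parent_exists)
  !parent_exists.call(upload.model_class, upload.model_id)
end

# Fails fast for lost orphans; otherwise yields so the caller can perform the
# actual transfer from the primary (not modelled here).
def sync_upload(upload, state, parent_exists:, logger:, now: Time.now)
  if lost_orphan?(upload, parent_exists)
    message = "Upload with ID #{upload.id} is orphaned. " \
              "Model with class #{upload.model_class} and ID #{upload.model_id} " \
              "does not exist in the database."
    logger.error(message)
    state.status = :failed
    state.last_failure = message
    state.retry_at = now + ORPHAN_RETRY_DELAY # long delay: the problem rarely fixes itself
    return state
  end

  yield upload if block_given?
  state.status = :synced
  state
end

# Example call with an in-memory lookup standing in for the database.
known_parents = { ["Project", 1] => true }
parent_exists = ->(klass, id) { known_parents.key?([klass, id]) }
orphan = Upload.new(id: 7, model_class: "Project", model_id: 99)
sync_upload(orphan, SyncState.new, parent_exists: parent_exists, logger: Logger.new($stderr))
```

The point of the early return is that the job updates state and releases its concurrency slot right away, without making any request to the primary.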
Edited by Michael Kozono