Avoid duplicate jobs in the replication queue
We have seen a number of situations where the replication queue grew to a massive number of rows.
This could be caused by syncs that keep failing while the coordinator keeps calling Enqueue() to add jobs.
At the moment, duplicate jobs are cleaned up on Acknowledge(), but after some investigation it turns out that some of the cleanup conditions are too strict:
- Only jobs from the same source storage are cleaned up. I don't think we need to store the source in the replication job; the replicator should pick an up-to-date source at the moment replication is initiated.
- We can be smarter when looking at job->>'change'. For example, a 'delete' job can overrule an 'update' job.
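
To make the two points above concrete, here is a minimal sketch of the relaxed cleanup rule. It is not the actual queue implementation: the Job fields, the CHANGE_PRECEDENCE table, and the collapse_jobs helper are all hypothetical names, and it assumes that only the repository and the target storage identify a duplicate, and that a 'delete' always wins over an 'update' regardless of enqueue order.

```python
from dataclasses import dataclass

# Hypothetical precedence table: a higher number overrules a lower one.
# Only 'update' and 'delete' are mentioned in this issue.
CHANGE_PRECEDENCE = {"update": 0, "delete": 1}


@dataclass(frozen=True)
class Job:
    """Illustrative stand-in for a replication queue row (field names are assumptions)."""
    job_id: int
    repository: str
    target_storage: str
    source_storage: str
    change: str


def collapse_jobs(jobs):
    """Keep one job per (repository, target_storage), ignoring source_storage.

    When several jobs address the same target, the job whose change has the
    highest precedence wins; ties go to the job that appears first.
    """
    winners = {}
    for job in jobs:
        key = (job.repository, job.target_storage)
        current = winners.get(key)
        if current is None or (
            CHANGE_PRECEDENCE[job.change] > CHANGE_PRECEDENCE[current.change]
        ):
            winners[key] = job
    return list(winners.values())
```

In the real queue this would more likely be expressed as a condition in the cleanup query rather than in application code; the sketch only shows which jobs would survive under the proposed rule.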