Manifest deletes cascade to manifest_references on child_id
Problem
Related to #255 (closed).
While working on !508 (merged), one of the integration tests (TestRepositoryStore_DeleteManifest
) started to fail with the following error:
deleting manifest: ERROR: insert or update on table "gc_manifest_review_queue" violates foreign key constraint "fk_gc_manifest_review_queue_repo_id_and_manifest_id_manifests" (SQLSTATE 23503)
This test deletes a manifest that is referenced in a manifest list within the same repository. Looking at the DDL we can see that manifest_references
has a FK on (repository_id, child_id)
for manifests(repository_id, id)
with ON DELETE CASCADE
.
As a consequence, when we delete a manifest that is a child manifest to a manifest list, the delete on manifests
cascades to manifest_references
. This then triggers gc_track_deleted_manifest_lists_trigger
(introduced in !508 (merged)), which will attempt to insert (repository_id, child_id)
in gc_manifest_review_queue
.
Finally, because we recently introduced (!497 (merged)) a FK constraint on gc_manifest_review_queue(repository_id, manifest_id)
pointing to manifests(repository_id, id)
, the gc_track_deleted_manifest_lists_trigger
fails with the error above because (within the context of the running transaction) the child manifest is no longer in manifests
, thus the foreign key constraint violation.
This does not result in a data integrity breach because the implicit DB transaction is rolled back when the trigger fails, so nothing is deleted.
Example
manifests
id | repository_id | Type |
---|---|---|
1 | 1 | manifest |
2 | 1 | manifest list |
manifest_references
repository_id | parent_id | child_id |
---|---|---|
1 | 2 | 1 |
gc_manifest_review_queue
repository_id | manifest_id |
---|---|
1 | 1 |
1 | 2 |
Deleting manifests(1,1)
would produce the described error.
Proposal
With the filesystem-backed API (no metadata database), we can currently delete a manifest (link) that is referenced in a manifest list. I believe this is why we initially decided to cascade by child_id
and not just by parent_id
.
However, apart from creating problems at the DB level, functionally, it does not make sense to allow a manifest to be deleted from a repository if in the same repository there is a manifest list that depends on it. Although we could still continue to serve the manifest list (we store its payload, so we don't need to recreate it by joining the individual manifests information), a client would be unable to retrieve that specific manifest from the list.
For this reason, I think we should not cascade by child_id
, only by parent_id
. This can be done by replacing ON DELETE CASCADE
with ON DELETE RESTRICT
. Doing this will prevent a manifest from being deleted if it's still referenced in manifest_references
, which is a nice integrity guarantee to have at the DB level.
At the application level, we will have to introduce a lookup to identify whether a manifest is referenced in any list before allowing it to be deleted (alternatively we can catch the DB error and treat it accordingly). This will lead to a new API response code, as at the moment there are no integrity restrictions when deleting manifests.