Use CommitStatus scopes to explicitly destroy records instead of nullifying the foreign key
What does this MR do and why?
This MR breaks the project_id
querying out of CommitStatus.for_project_paths
so it can be reused with a specific project id.
We do this so we can send a DELETE
query directly to every record the ci_builds
table, unconstrained by CommitStatus type.
Currently:
pry(main)> project.commit_statuses.delete_all
CommitStatus Update All (0.5ms) UPDATE "ci_builds" SET "project_id" = NULL WHERE "ci_builds"."project_id" = 1
With this change:
pry(main)> CommitStatus.where(project_id: project.id).delete_all
CommitStatus Destroy (0.5ms) DELETE FROM "ci_builds" WHERE "ci_builds"."project_id" = 1
When destroying CI records, we need to fully ensure that all records in ci_builds
get destroyed before the project record goes. Right now, we are just setting the project_id
column of ci_builds
to null
.
This is because Project currently does not specify a deletion strategy for the relation:
has_many :commit_statuses
and so the default :nullify
strategy is used.
This is different than the builds relation, which has a specified strategy despite pointing to the same table:
has_many :builds, class_name: 'Ci::Build', inverse_of: :project, dependent: :destroy
So when we see the Projects::DestroyService
reporting leftover commit statuses, those commit status records are not being destroyed, only having their project_id
set to null
.
How to set up and validate locally
1. Initial Counts
[13] pry(main)> CommitStatus.count
(2.6ms) SELECT COUNT(*) FROM "ci_builds"
=> 663
[10] pry(main)> project = Project.find(4)
Project Load (1.0ms) SELECT ...
Route Load (0.3ms) SELECT ...
=> #<Project id:4 Commit451/lab-coat>>
[14] pry(main)> project.commit_statuses.count
(0.7ms) SELECT COUNT(*) FROM "ci_builds" WHERE "ci_builds"."project_id" = 4
=> 68
[11] pry(main)> cs_ids = project.commit_statuses.pluck(:id)
(0.5ms) SELECT "ci_builds"."id" FROM "ci_builds" WHERE "ci_builds"."project_id" = 4
=> [...]
[12] pry(main)> cs_ids.count
=> 68
2. Attempt delete_all
[15] pry(main)> project.commit_statuses.delete_all
CommitStatus Update All (21.3ms) UPDATE "ci_builds" SET "project_id" = NULL WHERE "ci_builds"."project_id" = 4
=> 68
3. Observe the same total counts, but the relation count gone to zero
[16] pry(main)> CommitStatus.count
(0.6ms) SELECT COUNT(*) FROM "ci_builds"
=> 663
[18] pry(main)> project.commit_statuses.count
(0.5ms) SELECT COUNT(*) FROM "ci_builds" WHERE "ci_builds"."project_id" = 4
=> 0
[17] pry(main)> CommitStatus.where(project_id: nil).count
(0.6ms) SELECT COUNT(*) FROM "ci_builds" WHERE "ci_builds"."project_id" IS NULL
=> 68
4. The records themselves still exist
[19] pry(main)> CommitStatus.where(id: cs_ids).count
(1.1ms) SELECT COUNT(*) FROM "ci_builds" WHERE "ci_builds"."id" IN (...)
=> 68
5. Reset the project_id on the abandoned CommitStatus records
[36] pry(main)> CommitStatus.where(project_id: nil).update_all(project_id: project.id)
CommitStatus Update All (8.2ms) UPDATE "ci_builds" SET "project_id" = 4 WHERE "ci_builds"."project_id" IS NULL
=> 68
[37] pry(main)> CommitStatus.where(project_id: nil).count
(0.6ms) SELECT COUNT(*) FROM "ci_builds" WHERE "ci_builds"."project_id" IS NULL
=> 0
6. Observe the same original total
[40] pry(main)> CommitStatus.count
(0.6ms) SELECT COUNT(*) FROM "ci_builds"
=> 663
[42] pry(main)> project.commit_statuses.count
(1.6ms) SELECT COUNT(*) FROM "ci_builds" WHERE "ci_builds"."project_id" = 4
=> 68
7. Use new delete_all statement
[41] pry(main)> CommitStatus.for_project_id(project.id).delete_all
CommitStatus Destroy (84.1ms) DELETE FROM "ci_builds" WHERE "ci_builds"."project_id" = 4
=> 68
8. Observe Updated totals
[44] pry(main)> CommitStatus.count
(1.8ms) SELECT COUNT(*) FROM "ci_builds"
=> 595
[45] pry(main)> project.commit_statuses.count
(0.7ms) SELECT COUNT(*) FROM "ci_builds" WHERE "ci_builds"."project_id" = 4
=> 0
[46] pry(main)> CommitStatus.where(project_id: nil).count
(0.5ms) SELECT COUNT(*) FROM "ci_builds" WHERE "ci_builds"."project_id" IS NULL
=> 0
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Follow-up
After this is running in production, we should do a scan of the ci_builds
table (I know, I know) and destroy all the builds with a blank project_id
.
I say "destroy" specifically, because we might have to instantiate the build record itself and call ActiveRecord::Base#destroy
on it, in order to trigger the associated callbacks that destroy Ci::JobArtifact records
. Ci::PipelineDestroyService
is supposed to take care of Ci::JobArtifacts, so this might not be necessary, but I don't want to forget to think about it.
Right now the ON CASACADE DELETE
has taken care of us, so we don't need to go back and do any cleanup. But we'll have needed to do this in order to complete #340256 (closed)