Strip HTML and/or truncate excessively long comments on vulnerability_feedback table
What does this MR do and why?
This MR introduces a batched background migration job that strips HTML tags from and/or truncates excessively long comments in the vulnerability_feedback table.
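As a rough illustration of the intended behaviour, here is a minimal sketch in plain Ruby. The helper name `sanitize_comment`, the constant `MAX_COMMENT_LENGTH`, and the regex-based tag stripping are assumptions made for this example, not the code in this MR. The real job may well use a proper HTML sanitizer instead. The 50,000 limit is taken from the queries in the Database review section.

```ruby
# Hypothetical helper: names and the naive regex are illustrative assumptions,
# not the implementation shipped in this MR.
MAX_COMMENT_LENGTH = 50_000

def sanitize_comment(comment, max_length: MAX_COMMENT_LENGTH)
  # Comments at or below the limit are left untouched, mirroring the
  # char_length(comment) > 50000 filter used for batch selection.
  return comment if comment.nil? || comment.length <= max_length

  # Naive tag removal: drops anything that looks like <...>.
  stripped = comment.gsub(/<[^>]*>/, '')

  # String#[] with (start, length) counts characters rather than bytes,
  # so the result is at most max_length characters long.
  stripped[0, max_length]
end
```

Stripping first and truncating second means a comment that is only too long because of markup keeps all of its visible text.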
Related to #383703 (closed) step 4
Database review
Truth be told, this is going to be a no-op on GitLab.com because the longest comment in that table is 2,817 characters long, but @ahegyi was right to point out that we don't know whether that holds for our self-hosted customers. I think 50,000 characters is unlikely, but it's better to be safe than sorry.
Batch selection
SELECT "vulnerability_feedback"."id" FROM "vulnerability_feedback" WHERE "vulnerability_feedback"."id" BETWEEN 1 AND 596672 AND (char_length(comment) > 50000) ORDER BY "vulnerability_feedback"."id" ASC LIMIT 1
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13624/commands/47800
SELECT "vulnerability_feedback"."id" FROM "vulnerability_feedback" WHERE "vulnerability_feedback"."id" BETWEEN 1 AND 596672 AND (char_length(comment) > 50000) AND "vulnerability_feedback"."id" >= 11 ORDER BY "vulnerability_feedback"."id" ASC LIMIT 1 OFFSET 250
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13624/commands/47803
SELECT "vulnerability_feedback".* FROM "vulnerability_feedback" WHERE "vulnerability_feedback"."id" BETWEEN 472478 AND 472728 AND (char_length(comment) > 50000) AND "vulnerability_feedback"."id" >= 472478
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13624/commands/47804
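To make the role of the three queries easier to follow, here is a small in-memory sketch of the batching pattern they appear to implement: find the first matching id, find the id one batch further along via an offset, then load the rows in between. The method name `each_sub_batch` and the array-based model are assumptions for illustration only. The real iteration happens in SQL through the batching helpers, and the exact range conditions may differ.

```ruby
# Illustrative in-memory model of offset-based batching over matching ids.
# `ids` stands in for the ids of rows where char_length(comment) > 50000.
def each_sub_batch(ids, batch_size)
  return enum_for(:each_sub_batch, ids, batch_size) unless block_given?

  sorted = ids.sort
  start_index = 0

  while start_index < sorted.length
    # Like: ORDER BY id ASC LIMIT 1
    start_id = sorted[start_index]
    # Like: ORDER BY id ASC LIMIT 1 OFFSET batch_size (nil on the last batch)
    stop_id = sorted[start_index + batch_size]

    # Like: SELECT * ... WHERE id >= start_id [AND id < stop_id]
    batch = sorted.select { |id| id >= start_id && (stop_id.nil? || id < stop_id) }
    yield batch

    start_index += batch_size
  end
end
```

For example, `each_sub_batch([11, 20, 35, 40, 41], 2).to_a` yields the batches `[11, 20]`, `[35, 40]` and `[41]`. Gaps between ids do not matter, because the boundaries come from the ordered result set rather than from fixed id arithmetic.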
Record update
EXPLAIN UPDATE "vulnerability_feedback" SET "updated_at" = '2022-12-01 18:25:43.551210', "comment" = 'definitely shorter' WHERE "vulnerability_feedback"."id" = 12
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13624/commands/47805
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.