Limit updates to Web Hook backoff interval
If a web hook times out, this is treated as an error and `WebHook#backoff!` is executed. However, if the hook fires repeatedly, which is common for a system hook or group hook, it's possible for this backoff to update the same row repeatedly via `WebHooks::LogExecutionWorker` jobs. This not only generates unnecessary table bloat, but it can also cause a significant performance degradation while a long-running transaction is open.
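
As a rough illustration of the failing path (the worker body and its arguments here are hypothetical simplifications, not the actual GitLab code):

```ruby
# Hypothetical simplification of the execution-logging path.
module WebHooks
  class LogExecutionWorker
    include ApplicationWorker

    # hook_id and response_category are illustrative parameters.
    def perform(hook_id, response_category)
      hook = WebHook.find(hook_id)

      # Before this change, every failed execution issued an UPDATE on
      # the same web_hooks row, even while the hook was already backed off.
      hook.backoff! if response_category == 'failed'
    end
  end
end
```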
These concurrent row updates can cause PostgreSQL to allocate multixact transaction IDs. A `SELECT` will prompt PostgreSQL to prune tuples opportunistically, but this pruning may slow down significantly if the window of multixact tuples grows over time. Once the simple LRU cache can no longer hold the multixact XIDs in memory, we see slowdowns when accessing the `web_hooks` table.
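
For context, one way to observe this growth from a Rails console (a diagnostic sketch only, not part of this change; `mxid_age()` and `pg_class.relminmxid` are standard PostgreSQL, the rest is illustrative):

```ruby
# Diagnostic sketch: report how far the oldest multixact ID referenced
# by the web_hooks table has advanced.
age = ActiveRecord::Base.connection.select_value(<<~SQL)
  SELECT mxid_age(relminmxid) FROM pg_class WHERE relname = 'web_hooks'
SQL

puts "web_hooks multixact age: #{age}"
```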
To avoid this, we cap the number of backoffs at 100 (`MAX_FAILURES`) and only update the row if the `disabled_until` time has elapsed. This should ensure the hook fires at most once every 24 hours and updates the row only once during that period.
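
A minimal sketch of the guard, assuming a `backoff_count` column and a `next_backoff` helper that computes the next interval (both illustrative; the real model may differ in detail):

```ruby
class WebHook < ApplicationRecord
  MAX_FAILURES = 100

  def backoff!
    # Skip the UPDATE entirely while a previous backoff window is still
    # in force; this is what bounds writes to one per window.
    return if disabled_until.present? && disabled_until.future?

    update!(
      # Cap the failure count so the computed interval stops growing.
      backoff_count: [backoff_count + 1, MAX_FAILURES].min,
      disabled_until: next_backoff.from_now
    )
  end
end
```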
Relates to #340272