fix(notifications): retrying sink does not stop after threshold is reached
The retryingSink in the notifications system does not stop after reaching the threshold; instead, it keeps attempting to make the connection. An unintended consequence is that the goroutine leaks, because the retry loop never stops.
ERRO[1043] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1043] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1044] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1044] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1045] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1045] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1046] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1046] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1047] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1047] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1048] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1048] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1049] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1049] httpSink{http://registry.test:3333/} encountered too many errors, backing off
Solution
We have introduced a maxretries parameter to the notifications section via !1606 (merged).
To fix this issue, users will need to wait for the new release of the registry and the updated versions of the Linux package and Helm Charts installations.
The MR also deprecates threshold.
Checklist
- merge feat(notifications): add backoff sink with maxr... (!1606 - merged)
- Add maxretries to Omnibus
- Add maxretries to Charts
- Update docs and add note to https://docs.gitlab.com/ee/administration/packages/container_registry.html#configure-container-registry-notifications
Edited by Jaime Martinez