change rate limit err to ResourceExhausted
What
Change the returned gRPC code from Unavailable to ResourceExhausted when the user reaches concurrency limits.
Why
In gitlab-com/gl-infra/production#8056 (closed) and gitlab-com/gl-infra/production#8071 (closed), a single user paged the on-call because of a high error rate. When we looked into the error rate, we found that the user was reaching concurrency limits. Rate limiting a user is not an error for us but normal behavior, just like an HTTP 429 status code.
In https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16844#note_1175622086 we looked into the best gRPC code to return, and ResourceExhausted was the best one because it lets us differentiate between a real server error and a user error. This goes against the gRPC mapping but follows what the Envoy proxy does.
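To illustrate why the distinct code helps, here is a minimal sketch of how an error-rate alert might classify the two codes. It uses only the standard library: the `Code` type, its constants, and `countsAsServerError` are hypothetical names made up for this example and are not part of this change. The constants only mirror the numeric values of the standard gRPC status codes, which a real service would take from `google.golang.org/grpc/codes`.

```go
package main

import "fmt"

// Code mirrors the numeric values of the standard gRPC status codes.
// Hypothetical stand-in: a real service would use google.golang.org/grpc/codes.
type Code uint32

const (
	ResourceExhausted Code = 8  // the user hit a limit (comparable to HTTP 429)
	Unavailable       Code = 14 // the server could not serve the request
)

// countsAsServerError is a hypothetical classifier for an error-rate alert.
// Only the two codes discussed here are handled; all other codes are left
// out of this sketch and reported as not counted.
func countsAsServerError(c Code) bool {
	switch c {
	case ResourceExhausted:
		// Rate limiting is expected behavior, so it should not page anyone.
		return false
	case Unavailable:
		// A genuinely unavailable server is a real error.
		return true
	default:
		return false
	}
}

func main() {
	for _, c := range []Code{ResourceExhausted, Unavailable} {
		fmt.Printf("code=%d server_error=%t\n", c, countsAsServerError(c))
	}
}
```

When both situations return Unavailable, a classifier like this cannot tell them apart, which is how a single rate-limited user could page the on-call.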
- Reference: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16844
- Reference: #4637 (closed)