Add gRPC client metrics to model-gateway
Corrective action from https://gitlab.com/gitlab-com/gl-infra/production/-/issues/15657.
In incident https://gitlab.com/gitlab-com/gl-infra/production/-/issues/15657, gRPC client-side load balancing appears to have failed, creating a hot spot on two of the four running Triton servers.
This led to GPU saturation on those pods, which in turn caused latency spikes and Apdex drops on the service.
It is still not known why the gRPC round-robin load balancing failed, but this is being investigated.
Proposal
Add gRPC client-side metrics to the model-gateway using the py-grpc-prometheus
library (https://pypi.org/project/py-grpc-prometheus/).
This will not fix the problem but may help us understand why certain nodes are dropping out of the server pool.