Improve visibility of time spent with external IO
Related issues
Closes gitlab-com/gl-infra/scalability#302 (closed). Depends on https://gitlab.com/gitlab-org/labkit-ruby/-/merge_requests/50, and must wait until the labkit version is bumped.
What does this MR do?
Once Labkit publishes external HTTP notification events (powered by ActiveSupport::Notifications), this MR captures, stores, and accumulates those events. The accumulated information is used to:
- Add `external-http` to the performance bar.
- Add `external_http_count` and `external_http_duration` fields to Puma logs and Sidekiq logs.
- Expose some metrics to Prometheus:
  - `gitlab_external_http_total`: total number of external requests. This metric has two labels: `code`, which is the returned HTTP code, and `method`. The request URI and its parts are removed to reduce metric cardinality.
  - `gitlab_external_http_duration_seconds`: histogram of external request duration. No labels are provided.
  - `gitlab_external_exception_total`: a counter that exposes the number of exceptions raised when making such requests.
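The accumulation described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this MR: the class name `ExternalHttpAccumulator`, its methods, and the payload keys (`:method`, `:code`, `:duration`, `:exception`) are all hypothetical. It also assumes that a request which raises is still counted in the per-label totals, which the actual implementation may handle differently.

```ruby
# Hypothetical accumulator for external HTTP events (illustrative names only).
class ExternalHttpAccumulator
  attr_reader :count, :duration, :requests_by_label, :exception_count

  def initialize
    @count = 0                        # feeds external_http_count
    @duration = 0.0                   # feeds external_http_duration (seconds)
    @requests_by_label = Hash.new(0)  # stands in for gitlab_external_http_total
    @exception_count = 0              # stands in for gitlab_external_exception_total
  end

  # payload is assumed to be a Hash with :method, :code, :duration,
  # and optionally :exception. Any :uri key is ignored on purpose.
  def record(payload)
    @count += 1
    @duration += payload.fetch(:duration, 0.0)

    # Only code and method are used as labels, to keep cardinality low.
    labels = { code: payload[:code].to_s, method: payload[:method].to_s }
    @requests_by_label[labels] += 1

    @exception_count += 1 if payload[:exception]
  end

  # Fields that would be merged into a Puma or Sidekiq log line.
  def log_fields
    { external_http_count: @count, external_http_duration: @duration.round(6) }
  end
end

acc = ExternalHttpAccumulator.new
acc.record(method: "GET", code: 200, duration: 0.25, uri: "https://example.com/a")
acc.record(method: "POST", code: 500, duration: 0.5, uri: "https://example.com/b")
acc.log_fields
```

Dropping the URI before building the label set is what keeps the number of distinct time series bounded by the number of (code, method) pairs.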
Screenshots (strongly suggested)
Performance bar overview. A warning sign is shown when there are too many requests, when the total duration exceeds the threshold, or when the duration of any single request exceeds the individual threshold.
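The three warning conditions can be expressed as a single predicate. The sketch below is only an illustration: the method name `external_http_warning?` and all three default threshold values are hypothetical placeholders, not the values used by the performance bar.

```ruby
# Returns true when any of the three warning conditions holds.
# durations: array of per-request durations in seconds.
# The default thresholds are made-up placeholders for illustration.
def external_http_warning?(durations, max_count: 10, max_total: 0.5, max_single: 0.1)
  durations.size > max_count ||                 # too many requests
    durations.sum > max_total ||                # total duration too high
    durations.any? { |d| d > max_single }       # one request too slow
end

external_http_warning?([0.05, 0.05]) # no condition holds
external_http_warning?([0.3])        # a single slow request
```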
Performance bar details. For each external request:
- Display duration
- Display request destination
- Display status code or exception
- Display proxy if available
External HTTP Sidekiq logs
External HTTP Puma logs
A portion of the exposed Prometheus metrics. This screenshot was captured from the `/-/metrics` endpoint.
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- Tested in all supported browsers
- Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- Label as security and @ mention @gitlab-com/gl-security/appsec
- The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- Security reports checked/validated by a reviewer from the AppSec team