Assign custom Apdex targets for CI jobs endpoints
Context
We need to set custom Apdex targets for Rails controller and REST API endpoints related to CI jobs as part of our Q4 scalability initiative for our ~SaaS offering: https://gitlab.com/gitlab-org/gitlab/-/issues/343561, gitlab-com/gl-infra/scalability#1315 (closed)
The default target duration is 1 second. We need to determine which endpoints should get a custom target (up to 5 seconds). Ideally, high-traffic endpoints that users call directly (not as part of a background process) should be set to a high urgency (target under 1 second). Low-traffic, slow endpoints may need to be set to a low urgency. Follow these steps to find the right urgency for each endpoint: https://docs.gitlab.com/ee/development/application_slis/rails_request_apdex.html#how-to-adjust-the-urgency
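To make the urgency levels concrete, here is a minimal Ruby sketch of how request durations compare against a target. The 1-second default and 5-second maximum come from this issue; the 0.25 and 0.5 second values for `:high` and `:medium` are assumed from the linked documentation and should be checked there. `URGENCY_TARGETS`, `meets_target?` and `apdex_ratio` are hypothetical names for illustration, not GitLab code, and the ratio is a simplified stand-in for the real SLI.

```ruby
# Target durations in seconds per urgency level.
# Assumption: values taken from the linked documentation; verify before use.
URGENCY_TARGETS = {
  high: 0.25,
  medium: 0.5,
  default: 1.0,
  low: 5.0
}.freeze

# Hypothetical helper: true when a request finished within the target
# duration of the given urgency.
def meets_target?(duration_s, urgency = :default)
  duration_s <= URGENCY_TARGETS.fetch(urgency)
end

# Hypothetical helper: share of requests (0.0..1.0) that met the target.
def apdex_ratio(durations_s, urgency = :default)
  return 1.0 if durations_s.empty?

  durations_s.count { |d| meets_target?(d, urgency) }.fdiv(durations_s.size)
end

sample = [0.2, 0.4, 0.9, 1.8, 4.0] # made-up durations in seconds
puts apdex_ratio(sample, :default) # share of requests within 1 second
puts apdex_ratio(sample, :low)     # share of requests within 5 seconds
```

An endpoint that scores poorly against the default target but well against the 5-second one is a candidate for low urgency, provided it is not on a path users wait on directly.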
Technical details
REST API CI jobs endpoints: https://docs.gitlab.com/ee/api/jobs.html
Controller endpoints exist both in job-specific controllers like Projects::BuildsController and in the actions of other CI controllers such as Projects::PipelinesController#builds and Ci::ExternalPullRequests::CreatePipelineWorker
Instructions on setting custom Apdex targets: https://docs.gitlab.com/ee/development/application_slis/rails_request_apdex.html#how-to-adjust-the-urgency
Implementation table
Description | Link | Milestone | Timeframe |
---|---|---|---|
Assign custom Apdex targets for CI jobs endpoints | | %14.9 | FY22Q4 |
Set low urgency as Apdex targets for remaining PE endpoints | #360273 (closed) | %15.0 | FY23Q1 |
Remove ignored_components from PE error budget | #348552 (closed) | %15.0 | FY23Q2 |
Introduce Keyset pagination for GET /api/:version/projects/:id/jobs API endpoint | #362172 (closed) | %15.5 | FY23Q3 |
Backend: Improve performance of GraphqlController#execute | #361377 | TBD | FY23Q3 |
Backend: Improve performance of PATCH /api/:version/jobs/:id/trace | #353802 | TBD | FY23Q3 |
Assign custom Apdex targets for remaining PE endpoints | #348554 | TBD | FY23Q3 |
Steps
1. Locate all CI jobs REST API and controller endpoints. For now, the Elastic dashboard is the easiest way to get the list of endpoints, as described in the documentation: from the Stage Group Dashboard, in the error budget attribution row, follow the "Puma Apdex" link to see a list of endpoints and the duration targets they meet or exceed.
2. Select a target duration for each endpoint by consulting the duration table and following the steps for determining a custom duration.
3. Where the duration we want differs from the default (1 second), add a custom target duration to the endpoint. Involve a Scalability team member in the review.
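The selection in steps 2 and 3 can be sketched roughly as below. This is an illustration under stated assumptions, not the documented procedure: the durations are made up, the `:high` and `:medium` targets are assumed from the linked documentation, and `REQUIRED_RATIO` and `suggest_urgency` are hypothetical. Traffic volume and whether a user is waiting on the request still need human judgment.

```ruby
# Assumed urgency targets in seconds, ordered from strictest to most lenient.
URGENCY_TARGETS = { high: 0.25, medium: 0.5, default: 1.0, low: 5.0 }.freeze

# Hypothetical threshold: share of requests that must meet the target.
REQUIRED_RATIO = 0.95

# Returns the strictest urgency whose target is met by at least
# REQUIRED_RATIO of the observed durations, or nil when there is no data
# or even the :low target is missed.
def suggest_urgency(durations_s)
  return nil if durations_s.empty?

  URGENCY_TARGETS.each do |urgency, target|
    within = durations_s.count { |d| d <= target }
    return urgency if within.fdiv(durations_s.size) >= REQUIRED_RATIO
  end
  nil
end

# Made-up observations (seconds) for endpoints named in this issue.
observed = {
  "Projects::PipelinesController#builds" => [0.10, 0.20, 0.15, 0.22],
  "GET /api/:version/projects/:id/jobs" => [0.8, 2.5, 3.1, 4.2],
  "PATCH /api/:version/jobs/:id/trace" => [0.6, 0.7, 0.9, 0.8]
}

# Only endpoints whose suggestion differs from the default need a custom target.
observed.each do |endpoint, durations|
  urgency = suggest_urgency(durations)
  next if urgency == :default

  puts "#{endpoint}: #{urgency.inspect}"
end
```

A `nil` result means the endpoint misses even the 5-second target, which points to a performance fix rather than a new target. The chosen urgency is then declared on the controller or API endpoint as described in the linked documentation.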