Expensive query on /admin/applications times out - Remove "clients" column
Summary
There is a query on /admin/applications
that is preventing the page from loading (it produces a 500 error).
Impact
This makes it impossible to perform changes to any of our applications in production. A recent example of such a change: gitlab-com/gl-infra/production#7270 (closed).
Analysis
The query in question is:
/*application:web,correlation_id:01G77F4G77Q579EYMQA3HA6760,endpoint_id:Admin::ApplicationsController#index,db_config_name:main_replica*/ SELECT COUNT(DISTINCT "oauth_access_tokens"."resource_owner_id") AS count_resource_owner_id, "oauth_access_tokens"."application_id" AS oauth_access_tokens_application_id FROM "oauth_access_tokens" WHERE "oauth_access_tokens"."application_id" IN ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13) GROUP BY "oauth_access_tokens"."application_id"
A query plan (EXPLAIN ANALYZE
) can be seen here: https://explain.depesz.com/s/yg6i.
It appears the page is trying to calculate a unique count of resource owners per token for the set of applications on that page. In doing so, it spends almost 2 minutes scanning over a million rows, reading 2 GB of data from disk.
Recommendation
We should not try to run that query, as computing the count is prohibitively expensive. If we must, we should at the very least degrade gracefully if those counts cannot be loaded, and display the rest of the page without counts instead.
The proposal in this case is to remove the column from the UI and subsequently the query.
Verification
This can be verified by visiting https://gitlab.com/admin/applications with an admin account.