Add GraphQl option to sort CI catalog resources by usage count
What does this MR do and why?
This MR adds the ability to sort catalog resources by component usage count in GraphQl. It also exposes the column last_30_day_usage_count
as a GraphQl field in alpha
.
The value of this column is currently 0
for all catalog resources. It will be populated as part of #452545 (closed).
Database migration:
This MR also adds an index on last_30_day_usage_count
to improve the query performance when sorting by this column. Since only published catalog resources need to be sorted, the index has the condition where state = 1
.
There are currently less than 500
rows in catalog_resources
, so a post deploy migration was chosen since the index is non-critical to the feature at this time.
Resolves #452620 (closed)
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
How to set up and validate locally
- You can seed a few catalog resources in the shell with:
bundle exec rake "gitlab:seed:ci_catalog_resources[<YOUR-GROUP-PATH>, 3, true]"
- Then in Rails console, update the resources'
last_30_day_usage_count
values.
projects = Project.last(3) # These should be the 3 projects created in Step 1
projects.each do |project|
project.catalog_resource.update!(last_30_day_usage_count: rand(1..100))
end
- Go to
http://gdk.test:3000/-/graphql-explorer
and test the following query:
query getSortedCatalogResources {
ciCatalogResources(sort: USAGE_COUNT_DESC) {
nodes {
id
name
last30DayUsageCount
}
}
}
- Confirm that the output is ordered as expected with both
USAGE_COUNT_DESC
andUSAGE_COUNT_ASC
.
Migration
Up
main: == [advisory_lock_connection] object_id: 124500, pg_backend_pid: 38999
main: == 20240504042340 AddIndexCatalogResourcesOnUsageCount: migrating =============
main: -- transaction_open?(nil)
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.0170s
main: -- index_exists?(:catalog_resources, :last_30_day_usage_count, {:where=>"state = 1", :name=>"index_catalog_resources_on_last_30_day_usage_count", :algorithm=>:concurrently})
main: -> 0.0035s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0003s
main: -- add_index(:catalog_resources, :last_30_day_usage_count, {:where=>"state = 1", :name=>"index_catalog_resources_on_last_30_day_usage_count", :algorithm=>:concurrently})
main: -> 0.0068s
main: -- execute("RESET statement_timeout")
main: -> 0.0004s
main: == 20240504042340 AddIndexCatalogResourcesOnUsageCount: migrated (0.0548s) ====
main: == [advisory_lock_connection] object_id: 124500, pg_backend_pid: 38999
ci: == [advisory_lock_connection] object_id: 124740, pg_backend_pid: 39003
ci: == 20240504042340 AddIndexCatalogResourcesOnUsageCount: migrating =============
ci: -- transaction_open?(nil)
ci: -> 0.0000s
ci: -- view_exists?(:postgres_partitions)
ci: -> 0.0006s
ci: -- index_exists?(:catalog_resources, :last_30_day_usage_count, {:where=>"state = 1", :name=>"index_catalog_resources_on_last_30_day_usage_count", :algorithm=>:concurrently})
ci: -> 0.0036s
ci: -- execute("SET statement_timeout TO 0")
ci: -> 0.0003s
ci: -- add_index(:catalog_resources, :last_30_day_usage_count, {:where=>"state = 1", :name=>"index_catalog_resources_on_last_30_day_usage_count", :algorithm=>:concurrently})
ci: -> 0.0070s
ci: -- execute("RESET statement_timeout")
ci: -> 0.0003s
ci: == 20240504042340 AddIndexCatalogResourcesOnUsageCount: migrated (0.0301s) ====
ci: == [advisory_lock_connection] object_id: 124740, pg_backend_pid: 39003
Down
main: == [advisory_lock_connection] object_id: 124140, pg_backend_pid: 48027
main: == 20240504042340 AddIndexCatalogResourcesOnUsageCount: reverting =============
main: -- transaction_open?(nil)
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.0128s
main: -- indexes(:catalog_resources)
main: -> 0.0030s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0001s
main: -- remove_index(:catalog_resources, {:algorithm=>:concurrently, :name=>"index_catalog_resources_on_last_30_day_usage_count"})
main: -> 0.0020s
main: -- execute("RESET statement_timeout")
main: -> 0.0002s
main: == 20240504042340 AddIndexCatalogResourcesOnUsageCount: reverted (0.0290s) ====
main: == [advisory_lock_connection] object_id: 124140, pg_backend_pid: 48027
ci: == [advisory_lock_connection] object_id: 124140, pg_backend_pid: 48439
ci: == 20240504042340 AddIndexCatalogResourcesOnUsageCount: reverting =============
ci: -- transaction_open?(nil)
ci: -> 0.0000s
ci: -- view_exists?(:postgres_partitions)
ci: -> 0.0158s
ci: -- indexes(:catalog_resources)
ci: -> 0.0046s
ci: -- execute("SET statement_timeout TO 0")
ci: -> 0.0002s
ci: -- remove_index(:catalog_resources, {:algorithm=>:concurrently, :name=>"index_catalog_resources_on_last_30_day_usage_count"})
ci: -> 0.0093s
ci: -- execute("RESET statement_timeout")
ci: -> 0.0005s
ci: == 20240504042340 AddIndexCatalogResourcesOnUsageCount: reverted (0.0508s) ====
ci: == [advisory_lock_connection] object_id: 124140, pg_backend_pid: 48439
Query plans
Query
This is the query from Ci::Catalog::Listing
that is executed when you call the ciCatalogResources
endpoint with USAGE_COUNT_ASC
sorting:
Ci::Catalog::Resource.published.includes(:project).public_or_visible_to_user(current_user).order_by_last_30_day_usage_count_asc
SELECT "catalog_resources".*
FROM "catalog_resources"
WHERE "catalog_resources"."state" = 1
AND (
EXISTS (
SELECT 1 FROM "project_authorizations"
WHERE "project_authorizations"."user_id" = 12867582
AND (project_authorizations.project_id = catalog_resources.project_id)
)
OR catalog_resources.visibility_level IN (0,10,20)
)
ORDER BY "catalog_resources"."last_30_day_usage_count" ASC
Before adding the index
Query plan link: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/28071/commands/87390
- Sorting happens in memory with
quicksort
. This isn't a problem now because there are so few rows but it will be a bigger cost when the dataset grows.
After adding the index
Query plan link: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/28071/commands/87394
- In this query plan, the index is named
index_test4
. - It does not use
quicksort
and only needs to scan the new index.
Related to #452620 (closed)