Geo - Support filtering package files by keyword via GraphQL
What does this MR do and why?
Currently, for self-service replicables, we are unable to do any filtering on the Geo Replicable views.
This MR supports filtering package files registries by keyword (fuzzy_search) via GraphQL:
Registry. | Searchable attributes |
---|---|
Package Files | file_name |
Database
Raw SQL and query plans
Using:
Packages::PackageFile.search('GitLab')
Then:
SELECT
packages_package_files.id
FROM
packages_package_files
WHERE
packages_package_files.file_name ILIKE '%Gitlab%'
LIMIT 1000;
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/14970/commands/52119
Query plan (without index):
Limit (cost=1000.00..39545.23 rows=1000 width=8) (actual time=4.728..7204.836 rows=1000 loops=1)
Buffers: shared read=29683 dirtied=623 written=622
I/O Timings: read=19971.284 write=21.080
-> Gather (cost=1000.00..6767730.51 rows=175553 width=8) (actual time=4.727..7204.516 rows=1000 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared read=29683 dirtied=623 written=622
I/O Timings: read=19971.284 write=21.080
-> Parallel Seq Scan on public.packages_package_files (cost=0.00..6749175.21 rows=73147 width=8) (actual time=13.432..7190.436 rows=334 loops=3)
Filter: ((packages_package_files.file_name)::text ~~* '%Gitlab%'::text)
Rows Removed by Filter: 353748
Buffers: shared read=29683 dirtied=623 written=622
I/O Timings: read=19971.284 write=21.080
Statistics (without index):
Time: 7.209 s
- planning: 4.258 ms
- execution: 7.205 s
- I/O read: 19.971 s
- I/O write: 21.080 ms
Shared buffers:
- hits: 0 from the buffer pool
- reads: 29683 (~231.90 MiB) from the OS file cache, including disk I/O
- dirtied: 623 (~4.90 MiB)
- writes: 622 (~4.90 MiB)
Then: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/14970/commands/52124
Query plan (with index):
Limit (cost=1476.97..3673.71 rows=1000 width=8) (actual time=91.741..931.075 rows=1000 loops=1)
Buffers: shared hit=341 read=903 dirtied=18
I/O Timings: read=828.767 write=0.000
-> Bitmap Heap Scan on public.packages_package_files (cost=1476.97..387387.11 rows=175674 width=8) (actual time=91.738..930.493 rows=1000 loops=1)
Buffers: shared hit=341 read=903 dirtied=18
I/O Timings: read=828.767 write=0.000
-> Bitmap Index Scan using index_packages_package_files_on_file_name (cost=0.00..1433.05 rows=175674 width=0) (actual time=69.862..69.863 rows=132421 loops=1)
Index Cond: ((packages_package_files.file_name)::text ~~* '%Gitlab%'::text)
Buffers: shared hit=341 read=95
I/O Timings: read=6.828 write=0.000
Statistics (with index):
Time: 940.499 ms
- planning: 8.136 ms
- execution: 932.363 ms
- I/O read: 828.767 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 341 (~2.70 MiB) from the buffer pool
- reads: 903 (~7.10 MiB) from the OS file cache, including disk I/O
- dirtied: 18 (~144.00 KiB)
- writes: 0
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related to #364722 (closed) and #411770
Edited by Javiera Tapia