Optimize getRepositoryMetadata when querying a single repository
The query in getRepositoryMetadata is currently slower than it needs to be when getting metadata for a single repository. This is due to Postgres ending up reading more rows than it needs to.
When querying with just repository id, Postgres ends up fully reading
both tables in the replicas subquery due to the full join. This is not
necessary, as Postgres would only have to get rows related to the
repository we are fetching metadata for. This happens as Postgres
doesn't push down the repository_id = <id>
filter down through the
join. We can help Postgres here by applying the filter manually to
both sides of the join.
When querying with virtual storage and relative path, Postgres ends up computing the full valid_primaries view and does not realize it could read one row from repositories table and query the view with the repository id from that row. We can again help Postgres by explicitly querying the view with the ID retrieved from the repositories table.
This commit optimizes both cases by shadowing the related tables and applying the filters there as needed. This ensures the tables in the full join and the view are never fully read/computed when we only need the records of a single repository.
Query plans:
GetRepositoryMetadata
: before 2781ms, after 0.437ms
GetRepositoryMetadataByPath
: before 3012ms, after 0.720ms
GetPartiallyAvailableRepositories
: before 6972ms, after 7000ms