
Add models for Virtual Registries, part 2/2

David Fernandez requested to merge 467972-vreg-db-models-part-2 into master

🔭 Context

With Maven virtual registry MVC (single upstream and... (&14137), we're starting the work on Virtual Registries. Virtual Registries is a feature that could be described as the evolution of the dependency proxy idea: having the GitLab instance act as a man in the middle between clients and artifact registries. Artifacts can be of any kind, but we're going to focus on packages and container images, starting with Maven packages specifically.

In other words, the GitLab instance can be configured to contact a set of upstreams and expose a specific virtual registry url that "talks" the artifact type API, in this case the Maven API. When a request hits this API, we'll check with the set of upstreams and the first one to answer successfully "wins". We will pull the response from that upstream, cache it in the GitLab instance and return it to the client.
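The request flow described above can be sketched in plain Ruby. This is only an illustration, not the implementation in GitLab: the `Upstream` and `VirtualRegistry` classes below are hypothetical in-memory stand-ins, upstreams are tried one after the other in their configured order (an assumption about how "first to answer successfully" is decided), and the cache is a simple hash keyed by the requested path.

```ruby
# Hypothetical stand-in for a remote Maven repository (not a GitLab class).
class Upstream
  attr_reader :name, :files

  def initialize(name, files)
    @name = name
    @files = files # relative path => response body
  end

  # Returns the body, or nil when this upstream cannot serve the path.
  def fetch(path)
    files[path]
  end
end

# Hypothetical in-memory virtual registry (not a GitLab class).
class VirtualRegistry
  def initialize(upstreams)
    @upstreams = upstreams
    @cache = {} # relative path => cached response body
  end

  def get(path)
    # Serve from the cache when a previous identical request filled it.
    return @cache[path] if @cache.key?(path)

    # Otherwise ask each upstream in order; the first successful answer wins.
    @upstreams.each do |upstream|
      body = upstream.fetch(path)
      next if body.nil?

      @cache[path] = body
      return body
    end

    nil # no upstream could answer
  end
end

central  = Upstream.new("central",  { "com/acme/lib/1.0/lib-1.0.pom" => "<project/>" })
internal = Upstream.new("internal", { "com/acme/app/2.0/app-2.0.jar" => "jar-bytes" })
registry = VirtualRegistry.new([central, internal])

registry.get("com/acme/app/2.0/app-2.0.jar") # pulled from "internal", then cached
internal.files.clear                         # simulate the upstream going down
registry.get("com/acme/app/2.0/app-2.0.jar") # still answered, from the cache
```

The last two calls illustrate the reliability point: once a response is cached, an identical request no longer depends on the upstream being reachable.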

The benefits are:

  • multiple upstreams are aggregated behind a single URL = simpler configuration on the clients.
  • by caching responses and using those caches in subsequent (identical) requests, we improve the reliability of the system. If the related upstream is down but we have all the correct caches in GitLab, then a client can still pull the dependencies for a project.
  • dependency firewall features. The GitLab instance can do more than just caching. We could run a vulnerability check so that we don't allow vulnerable dependencies to enter the system.

👣 First iteration's scope

The scope of this feature being quite large, we reduced it for the first iteration. Here are the main aspects:

  • Will work at (root) Group level.
  • Maven packages only.
  • Restrictions on the associations counts:
    • A (root) Group can only have 1 registry (of type Maven).
    • A (maven) registry can only have 1 upstream.

The implementation that we start here should be able to host the evolutions of those restrictions:

  • Support to have the Virtual Registry at a different level (such as Organisation).
  • Support for other package formats.
  • Support for other artifact types than packages, namely container registries.
  • Support for multiple registries.
  • Support for multiple upstreams.
    • Support for different upstream types: local vs remote.

See the detailed analysis in #457503 (comment 1949349752).

💽 Database tables and models

This MR is part of Maven Virtual Registry: Database models (#467972 - closed) which tackles the database tables and models that we will need.

classDiagram
    class Reg["VirtualRegistries::Packages::Maven::Registry"]
    class RegU["VirtualRegistries::Packages::Maven::RegistryUpstream"]
    class U["VirtualRegistries::Packages::Maven::Upstream"]
    class CR["VirtualRegistries::Packages::Maven::CachedResponse"]

    Reg "1" --> "1" RegU
    RegU "1" --> "1" U
    U "1" --> "0..*" CR

As discussed above, several associations are 1:1 for now but will be changed into 1:n in the future.

One thing to note is that we specialize the tables by artifact type and subtype, in this case packages and maven. This is because we want to avoid the situation that we have in the group package registry, where the tables packages_packages and packages_package_files hold data for all package formats. In effect, we split the data by artifact type and subtype.

Moreover, some package formats can have specific settings, such as how to handle caching for specific requests (metadata). With one table for all formats, these settings would also be available to package formats that don't need them, which wouldn't make sense.

This MR introduces the last table. All the others were introduced in Add models for Virtual Registries, part 1/2 (!156930 - merged).

What does this MR do and why?

  • Add VirtualRegistries::Packages::Maven::CachedResponse.
    • Link it with VirtualRegistries::Packages::Maven::Upstream
    • Set up object storage links/references
  • Add the cached response uploader
  • Add the related specs.

The entire feature will be behind a feature flag. Since the models are not connected to any logic yet, the feature flag is not introduced in this MR.

🏎 MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

🦄 Screenshots or screen recordings

🤷

How to set up and validate locally

The only way to try this out is with a Rails console.

First, we're going to need the parent objects:

# get a root group
root_group = Group.first

# create the registry
r = ::VirtualRegistries::Packages::Maven::Registry.create!(group: root_group)

# create the upstream
u = ::VirtualRegistries::Packages::Maven::Upstream.create!(group: root_group, url: "https://maven.test")

# create the registry upstream (join table)
ru = ::VirtualRegistries::Packages::Maven::RegistryUpstream.create!(group: root_group, registry: r, upstream: u)

and now, let's create a cached response:

cr = ::VirtualRegistries::Packages::Maven::CachedResponse.create!(group: root_group, upstream: u, relative_path: "test/foo", size: 10)

# we can play with the validations
cr.update!(downloads_count: -1)
# => ActiveRecord::RecordInvalid: Validation failed: Downloads count must be greater than 0

# can't create another cached response for the same upstream and relative path
cr = ::VirtualRegistries::Packages::Maven::CachedResponse.create!(group: root_group, upstream: u, relative_path: "test/foo", size: 10)
# => ActiveRecord::RecordInvalid: Validation failed: Relative path has already been taken
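
For readers without a GDK at hand, the two rules exercised above can be mimicked in plain Ruby. This is a minimal sketch and not the model code from this MR: `CachedResponseStore` and its `Invalid` error are hypothetical names, rows live in a hash instead of a table, and the default `downloads_count` of 1 is an assumption. It only mirrors what the console session shows: `downloads_count` must be greater than 0, and the pair (upstream, relative path) must be unique.

```ruby
# Hypothetical in-memory stand-in for the cached responses table.
class CachedResponseStore
  # Hypothetical error class, playing the role of ActiveRecord::RecordInvalid.
  class Invalid < StandardError; end

  def initialize
    @rows = {} # [upstream_id, relative_path] => downloads_count
  end

  # downloads_count defaults to 1 here; that default is an assumption.
  def create!(upstream_id:, relative_path:, downloads_count: 1)
    unless downloads_count > 0
      raise Invalid, "Downloads count must be greater than 0"
    end

    key = [upstream_id, relative_path]
    # Uniqueness is scoped to the upstream, like the unique index on
    # [:upstream_id, :relative_path].
    raise Invalid, "Relative path has already been taken" if @rows.key?(key)

    @rows[key] = downloads_count
    key
  end
end

store = CachedResponseStore.new
store.create!(upstream_id: 1, relative_path: "test/foo")
store.create!(upstream_id: 2, relative_path: "test/foo") # other upstream: allowed

begin
  store.create!(upstream_id: 1, relative_path: "test/foo")
rescue CachedResponseStore::Invalid => e
  puts e.message
end
```

The same relative path can be cached once per upstream, which is what a unique index scoped on the upstream allows.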

💾 Database review

Migration up

main: == [advisory_lock_connection] object_id: 130720, pg_backend_pid: 91631
main: == 20240712172152 CreateVirtualRegistriesPackagesMavenCachedResponses: migrating 
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- create_table(:virtual_registries_packages_maven_cached_responses, {:if_not_exists=>true})
main: -- quote_column_name(:relative_path)
main:    -> 0.0000s
main: -- quote_column_name(:file)
main:    -> 0.0000s
main: -- quote_column_name(:object_storage_key)
main:    -> 0.0000s
main: -- quote_column_name(:upstream_etag)
main:    -> 0.0000s
main: -- quote_column_name(:content_type)
main:    -> 0.0000s
main:    -> 0.0065s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.0160s
main: -- index_exists?(:virtual_registries_packages_maven_cached_responses, [:upstream_id, :relative_path], {:unique=>true, :name=>"idx_vregs_pkgs_mvn_cached_resp_on_uniq_upstrm_id_and_rel_path", :algorithm=>:concurrently})
main:    -> 0.0019s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0003s
main: -- add_index(:virtual_registries_packages_maven_cached_responses, [:upstream_id, :relative_path], {:unique=>true, :name=>"idx_vregs_pkgs_mvn_cached_resp_on_uniq_upstrm_id_and_rel_path", :algorithm=>:concurrently})
main:    -> 0.0009s
main: -- execute("RESET statement_timeout")
main:    -> 0.0001s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- execute("ALTER TABLE virtual_registries_packages_maven_cached_responses\nADD CONSTRAINT check_c2aad543bf\nCHECK ( downloads_count > 0 )\nNOT VALID;\n")
main:    -> 0.0003s
main: -- execute("ALTER TABLE virtual_registries_packages_maven_cached_responses VALIDATE CONSTRAINT check_c2aad543bf;")
main:    -> 0.0003s
main: == 20240712172152 CreateVirtualRegistriesPackagesMavenCachedResponses: migrated (0.0575s) 

main: == [advisory_lock_connection] object_id: 130720, pg_backend_pid: 91631
ci: == [advisory_lock_connection] object_id: 131120, pg_backend_pid: 91633
ci: == 20240712172152 CreateVirtualRegistriesPackagesMavenCachedResponses: migrating 
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- create_table(:virtual_registries_packages_maven_cached_responses, {:if_not_exists=>true})
ci: -- quote_column_name(:relative_path)
ci:    -> 0.0000s
ci: -- quote_column_name(:file)
ci:    -> 0.0000s
ci: -- quote_column_name(:object_storage_key)
ci:    -> 0.0000s
ci: -- quote_column_name(:upstream_etag)
ci:    -> 0.0000s
ci: -- quote_column_name(:content_type)
ci:    -> 0.0000s
ci:    -> 0.0072s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- view_exists?(:postgres_partitions)
ci:    -> 0.0004s
ci: -- index_exists?(:virtual_registries_packages_maven_cached_responses, [:upstream_id, :relative_path], {:unique=>true, :name=>"idx_vregs_pkgs_mvn_cached_resp_on_uniq_upstrm_id_and_rel_path", :algorithm=>:concurrently})
ci:    -> 0.0018s
ci: -- execute("SET statement_timeout TO 0")
ci:    -> 0.0001s
ci: -- add_index(:virtual_registries_packages_maven_cached_responses, [:upstream_id, :relative_path], {:unique=>true, :name=>"idx_vregs_pkgs_mvn_cached_resp_on_uniq_upstrm_id_and_rel_path", :algorithm=>:concurrently})
ci:    -> 0.0008s
ci: -- execute("RESET statement_timeout")
ci:    -> 0.0001s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- execute("ALTER TABLE virtual_registries_packages_maven_cached_responses\nADD CONSTRAINT check_c2aad543bf\nCHECK ( downloads_count > 0 )\nNOT VALID;\n")
ci:    -> 0.0003s
ci: -- execute("ALTER TABLE virtual_registries_packages_maven_cached_responses VALIDATE CONSTRAINT check_c2aad543bf;")
ci:    -> 0.0002s
I, [2024-07-12T20:53:39.196976 #91482]  INFO -- : Database: 'ci', Table: 'virtual_registries_packages_maven_cached_responses': Lock Writes
I, [2024-07-12T20:53:39.197228 #91482]  INFO -- : {:method=>"with_lock_retries", :class=>"gitlab:db:lock_writes", :message=>"Lock timeout is set", :current_iteration=>1, :lock_timeout_in_ms=>100}
I, [2024-07-12T20:53:39.197450 #91482]  INFO -- : {:method=>"with_lock_retries", :class=>"gitlab:db:lock_writes", :message=>"Migration finished", :current_iteration=>1, :lock_timeout_in_ms=>100}
ci: == 20240712172152 CreateVirtualRegistriesPackagesMavenCachedResponses: migrated (0.0316s) 

ci: == [advisory_lock_connection] object_id: 131120, pg_backend_pid: 91633

Migration down

main: == [advisory_lock_connection] object_id: 130220, pg_backend_pid: 90812
main: == 20240712172152 CreateVirtualRegistriesPackagesMavenCachedResponses: reverting 
main: -- drop_table(:virtual_registries_packages_maven_cached_responses)
main:    -> 0.0051s
main: == 20240712172152 CreateVirtualRegistriesPackagesMavenCachedResponses: reverted (0.0091s) 

main: == [advisory_lock_connection] object_id: 130220, pg_backend_pid: 90812

ci: == [advisory_lock_connection] object_id: 130220, pg_backend_pid: 91216
ci: == 20240712172152 CreateVirtualRegistriesPackagesMavenCachedResponses: reverting 
ci: -- drop_table(:virtual_registries_packages_maven_cached_responses)
ci:    -> 0.0045s
ci: == 20240712172152 CreateVirtualRegistriesPackagesMavenCachedResponses: reverted (0.0127s) 

ci: == [advisory_lock_connection] object_id: 130220, pg_backend_pid: 91216
Edited by David Fernandez
