Investigate required Configuration for Cloud-based backup
Context
With Portable Backups, we rely on the existing configuration information retrieved by SourceContext and OmnibusContext. They give the tool access to the required connection parameters and credentials, along with the locations where blobs are stored.
In Cloud-based backups, we need to integrate with each Cloud vendor instead.
Here we care about the following things:
- Which managed service stores each specific data-type
- What type of action we need to perform in each one to preserve data
- What type of action we need to perform in each one to restore data
- Store each service-data-reference that is part of a backup session
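The four points above could be captured in a small record per data type. The following is only a minimal sketch to make the discussion concrete: the class names, field names, and action values are hypothetical and are not taken from the existing tool.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Action(Enum):
    # Hypothetical action names; the real set depends on each vendor's API.
    COPY_OBJECTS = "copy_objects"
    SNAPSHOT = "snapshot"


@dataclass(frozen=True)
class ServiceDataReference:
    """One piece of data preserved during a backup session (hypothetical shape)."""
    data_type: str          # e.g. "artifacts"
    service: str            # managed service that stores this data type
    backup_action: Action   # what we do to preserve the data
    restore_action: Action  # what we do to restore the data
    reference: str          # vendor-side identifier of the preserved data


@dataclass
class BackupSession:
    """Collects every service data reference that belongs to one backup."""
    session_id: str
    references: List[ServiceDataReference] = field(default_factory=list)

    def record(self, ref: ServiceDataReference) -> None:
        self.references.append(ref)

    def for_data_type(self, data_type: str) -> List[ServiceDataReference]:
        return [r for r in self.references if r.data_type == data_type]
```

A restore could then iterate over the recorded references of a session and apply each restore action, without having to rediscover where the data was stored.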
Proposal
Based on the MVC implementation of Object Storage Backup: #455385 (closed), identify what information the tool requires to determine what needs to be backed up and where to.
Here are some questions / suggestions to guide the work:
- How do we identify which Object Storage endpoints we need to back up?
- How do we link an Object Storage endpoint to a specific data type?
- What should we do to prevent configuration mistakes?
- Should we perform some configuration validation step?
- How can we verify that the configuration points to the correct data type (e.g., that an artifacts configuration actually points to artifacts and not something else)?
- In the initial phase we should consider relying on the Object Storage configuration using the Consolidated format: https://docs.gitlab.com/ee/administration/object_storage.html#configure-each-object-type-to-define-its-own-storage-connection-storage-specific-form
- Do we see any challenges in supporting the non-consolidated format later on?
- If each blob has its own Object Storage, does the approach from #455385 (closed) support that model? (N:1 where N is the source and 1 is the backup bucket?)
- What credential format do we need in order to perform and restore a backup?
- How can we validate we have the correct credentials with the correct permissions?
- Should we build a credential validation command?
- Should such command logic execute prior to each backup?
- Should such command logic execute prior to each restore?
- With the gitlab-backup-cli tool being decoupled from the Rails codebase, should we consider storing the required configuration in a specific file for the tool, instead of relying on extracting information from the places where it may already exist?
- Does that approach aid us in integrating with Kubernetes / Helm charts?
Edited by Kyle Yetter