Draft: PoC: Pods Stateless Router Proposal (Iteration 2)
TL;DR
This is a proof of concept that models some aspects of the Pods Stateless Router Proposal
as described in !102553 (merged).
What does it do?
Iteration 1 (concluded with a recording on 12 October)
See !102770 (closed).
Iteration 2 (this MR, in progress)
These are additional changes implemented in this PoC to continue validating the solution; they have yet to be presented.
- **Move away from the Pod selector on the Performance Bar**: The intent is for as many routes as possible to be classified to a given Pod automatically. The Performance Bar then shows which Pod is used, but in most cases the ability to change the Pod stops working.
- **Router**: Implement a router to send pre-flight requests and `path_info` classification on the Rails side: !102553 (merged).
- **GitLab-Shell and Gitaly**: Fix support for Git Push and make it work with the Router.
- **`PreventClusterWrites`**: Implement a mechanism to model the async-writes approach, where only Pod 0 can write to cluster-wide tables.
- **QA**: Work on fixing as many QA tests as possible.
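The `PreventClusterWrites` mechanism, including the suppression used to flag call sites, could be sketched roughly like this. This is a minimal standalone sketch, not the PoC's actual implementation: the table list, the pod-detection attribute, and the SQL matching are all illustrative assumptions.

```ruby
# Standalone sketch of a PreventClusterWrites-style query analyzer.
class PreventClusterWrites
  ClusterWriteError = Class.new(StandardError)

  # Assumed sample of cluster-wide tables; the real set would come from
  # the schema classification.
  CLUSTER_WIDE_TABLES = %w[users organizations plans].freeze

  WRITE_PATTERN = /\A\s*(INSERT\s+INTO|UPDATE|DELETE\s+FROM)\s+"?(\w+)"?/i

  class << self
    attr_accessor :current_pod, :suppressed

    # Raises when any Pod other than pod_0 writes to a cluster-wide table.
    def analyze!(sql)
      return if suppressed || current_pod == 'pod_0'

      match = WRITE_PATTERN.match(sql)
      return unless match && CLUSTER_WIDE_TABLES.include?(match[2].downcase)

      raise ClusterWriteError, "#{current_pod} attempted to write to #{match[2]}"
    end

    # Temporarily allow cluster writes, marking call sites that must later
    # be changed to forward the write to pod_0 instead.
    def suppress
      self.suppressed = true
      yield
    ensure
      self.suppressed = false
    end
  end
end
```

For example, with `current_pod = 'pod_1'`, `analyze!('UPDATE "users" ...')` raises, while wrapping the same call in `suppress { ... }` lets it through, which mirrors how the PoC identifies write sites that need forwarding.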
This follows the async-writes approach, where only Pod 0 can write to cluster-wide tables.
- A `pod_N` always uses a DB replica of `cluster-wide` tables and is expected to observe latency of up to 500 ms on those tables => a region-first approach.
- The set of cluster-wide tables is under the `public` schema.
- Pod-specific tables are under the `pod_0` and `pod_1` schemas.
- A Rack/Sidekiq middleware is added to configure `connection.schema_search_path = "public,pod_0|pod_1"` depending on the `selected_pod` cookie, to model switching organizations.
- Only `pod_0` can write to `cluster-wide` tables: this is enforced by the `PreventClusterWrites` query analyzer.
- A `pod_N` forwards write calls via API to `pod_0`: this is currently modelled by suppressing `PreventClusterWrites` at the place where the write happens, to identify the places that need to be changed.
- Some endpoints that require cluster-wide access, like `/admin` or `/-/profile`, are forced to be served by `pod_0`.
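The Rack middleware that switches the search path based on the `selected_pod` cookie could look roughly like this dependency-free sketch. The cookie name and schema names come from the description above; the `env` key and the manual cookie parsing are illustrative simplifications — the real middleware would set `connection.schema_search_path` on the ActiveRecord connection.

```ruby
# Sketch of a Rack middleware selecting the Pod schema from a cookie.
class PodSchemaSelector
  VALID_PODS = %w[pod_0 pod_1].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    pod = cookie_value(env['HTTP_COOKIE'].to_s, 'selected_pod')
    pod = 'pod_0' unless VALID_PODS.include?(pod)

    # In the PoC this would configure ActiveRecord instead, e.g.:
    #   connection.schema_search_path = "public,#{pod}"
    env['pod.schema_search_path'] = "public,#{pod}"

    @app.call(env)
  end

  private

  # Minimal cookie parsing to keep the sketch free of the rack gem.
  def cookie_value(header, name)
    header.split(/;\s*/).map { |pair| pair.split('=', 2) }
          .to_h.fetch(name, nil)
  rescue StandardError
    nil
  end
end
```

An unknown or missing cookie falls back to `pod_0`, matching the idea that only Pod 0 is always a safe default for cluster-wide access.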
What problems does it ignore, because we know we can solve them?
- **Decompose cluster-wide tables**: We know that we can decompose all `cluster-wide` tables (as we did for the CI decomposition). The biggest problem there is fixing all cross-joins between schemas. Using a single logical database with separate PostgreSQL schemas (`cluster+pod_0` or `cluster+pod_1`) keeps all existing cross-joins working while still creating separate visibility between tables.
- **Monotonic sequences**: We know that we can handle ID sequences across all Pods in a non-conflicting way for things like `projects.id` or `issues.id`. This PoC makes all PostgreSQL sequences shared across `pod_0`/`pod_1`.
- **Loose foreign keys**: Loose foreign keys need to be updated to allow removal across different Pods.
- **Partitioning**: The partitioning code uses `gitlab_partitions_dynamic` and `gitlab_partitions_static`. Since this is not compatible with the `pod_N` approach, all partitioned tables are for now converted into non-partitioned ones.
- **Sidekiq Cron**: Only regular Sidekiq workers are covered. In the future each Pod would have its own Sidekiq Cron executor.
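On the monotonic-sequences point: this PoC shares the sequences, but one well-known non-conflicting alternative is interleaving, where each Pod allocates IDs starting at its own offset and stepping by the total Pod count (in PostgreSQL terms, `CREATE SEQUENCE ... START 1 INCREMENT 2` for one Pod and `START 2 INCREMENT 2` for the other). A hypothetical sketch of that allocation scheme:

```ruby
# Interleaved per-Pod ID allocation: Pod i of N produces i+1, i+1+N, i+1+2N, ...
# so two Pods can allocate concurrently without ever colliding.
class PodSequence
  def initialize(pod_index:, pod_count:)
    @next_id = pod_index + 1 # starting offsets: 1, 2, ..., pod_count
    @step = pod_count
  end

  def next_id
    id = @next_id
    @next_id += @step
    id
  end
end
```

With two Pods, `pod_0` yields 1, 3, 5, … and `pod_1` yields 2, 4, 6, …; the trade-off versus a shared sequence is that IDs are only monotonic per Pod, not globally.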
Problems to evaluate
- **Router**: The current approach uses a single GitLab instance, passes a cookie, and dynamically changes `schema_search_path` depending on the selected Pod. Ideally the router (Workhorse?) should understand, or have the logic to determine, how to route a request to the correct Pod based on information from GitLab Rails.
- **Cross-Pod talking**: a) fetch data from another Pod (like a Project); b) aggregate data across all Pods; c) schedule a Sidekiq job in the context of another Pod; d) route all (Controller, GraphQL and API) requests to the correct Pod automatically.
- **Many versions of GitLab**: A true Pods architecture allows running many different versions of GitLab at the same time, which allows upgrading some customers less frequently than others and thus improves resiliency against application bugs. In a model of decomposed shared cluster-wide tables this might not be possible, since we would require all nodes to run the same latest version of the application whenever cluster-wide tables were updated.
- ...
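The Router problem above — routing each request to the correct Pod using information from GitLab Rails — could be modelled as a simple path classification in the router. The `/admin` and `/-/profile` prefixes come from the endpoints listed earlier; the class name and everything else are illustrative assumptions (the real classification would come from Rails, e.g. via a pre-flight request):

```ruby
# Sketch of a router-side routing decision: cluster-wide endpoints are
# pinned to pod_0, everything else goes to the Pod selected for the user.
class PodRouter
  # Endpoints requiring cluster-wide access, always served by pod_0.
  CLUSTER_WIDE_PREFIXES = %w[/admin /-/profile].freeze

  def initialize(default_pod:)
    @default_pod = default_pod
  end

  def pod_for(path)
    return 'pod_0' if CLUSTER_WIDE_PREFIXES.any? { |prefix| path.start_with?(prefix) }

    @default_pod
  end
end
```

A static prefix table like this is only a stand-in; a real router would need the dynamic `path_info` classification that Rails already performs, which is exactly why the pre-flight approach is listed for evaluation.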
Run it
- Configure `config/database.yml` with `schema_search_path:`, ideally using a new DB
- Run `scripts/decomposition/create-pods-database`
- Run `bin/rake -t db:seed_fu` to seed the development database
- (Optionally) Run `scripts/decomposition/classify-pods-database` to fetch the test DB and update `gitlab_referenced.yml`
```yaml
# config/database.yml
development:
  main:
    database: gitlabhq_development_pods
    schema_search_path: public,pod_0
  ci:
    database: gitlabhq_development_pods
    database_tasks: false
    schema_search_path: public,pod_0
```
Edited by Kamil Trzciński