Assess Operator Chart version logic
Problem summary
The Operator appears to sometimes pick the wrong version of the chart when looking for the latest version. This leads to unexpected errors in the Operator container.
Working scenario: specifying the correct latest version
In this scenario, we specifically tell the Operator which version is the latest:
CHART_VERSION='4.9.1' TAG=a507be0d CLEANUP=no ./scripts/test.sh
...
manager 2021-02-25T21:06:26.591Z DEBUG template Looking for the designated GitLab Chart in the specified directory. {"chartVersion": "4.9.1", "directory": "/charts"}
Note that it correctly looks for version 4.9.1. The test.sh script passes in this case, as all workloads start as expected.
Broken scenario: letting the operator pick the latest version
If we remove the CHART_VERSION variable and let the Operator pick the latest version:
TAG=a507be0d CLEANUP=no ./scripts/test.sh
...
manager 2021-02-25T21:08:54.618Z DEBUG template Looking for the designated GitLab Chart in the specified directory. {"chartVersion": "4.8.4", "directory": "/charts"}
Note that the Operator incorrectly looks for version 4.8.4 instead of 4.9.1. In this case, the Controller later returns an error:
manager 2021-02-25T21:09:48.413Z ERROR controller Reconciler error {"reconcilerGroup": "apps.gitlab.com", "reconcilerKind": "GitLab", "controller": "gitlab", "name": "gitlab", "namespace": "gitlab-system", "error": "cluster-scoped resource must not have a namespace-scoped owner, owner's namespace gitlab-system"}
That error is not directly related to this issue, but it shows that Chart versions differ in ways that matter: running the same version of the Operator against different versions of the Chart can yield different results.
Proposal
In the short term, we should revisit the logic that picks the latest version to confirm it is sound. If it is, we should find out why the Operator isn't picking the latest version; the cause may lie in the Makefile and test.sh logic chain.
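As a reference point when reviewing that logic, here is a minimal sketch of one way to pick the highest version from a directory of chart archives. It is not the Operator's actual implementation. The function name latest_chart_version and the gitlab-<version>.tgz file naming are assumptions made for illustration, and it relies on the version-aware sort -V from GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the Operator repository): print the
# highest version among chart archives in a directory.
# Assumption: archives are named gitlab-<version>.tgz.
latest_chart_version() {
  local dir="$1"
  local f
  for f in "$dir"/gitlab-*.tgz; do
    [ -e "$f" ] || continue        # skip the literal pattern when nothing matches
    f="${f##*/gitlab-}"            # strip the directory and the "gitlab-" prefix
    printf '%s\n' "${f%.tgz}"      # strip the ".tgz" suffix, leaving the version
  done | sort -V | tail -n 1       # version-aware sort, keep the highest entry
}
```

With archives for 4.8.4, 4.9.1, and 4.10.0 present, this prints 4.10.0, whereas a plain lexicographic sort would rank 4.9.1 above 4.10.0. That ordering pitfall does not by itself explain 4.8.4 being chosen over 4.9.1, so it is only one thing to rule out. Note also that sort -V does not follow semantic versioning rules for pre-release suffixes.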
Longer term, I think we should consider specifying a single version of the GitLab Helm chart to bundle with each version of the Operator. Instead of using a script that will detect the last 3 minor versions of GitLab and include all 3 in the Operator image (https://gitlab.com/gitlab-org/gl-openshift/gitlab-operator/-/merge_requests/68), we should specify a static version of the Helm Chart to include.
The main benefit here is consistent test results regardless of when the tests run. For example, in its current state, we could run master today and see tests pass, then run master an hour later and see the pipeline fail because a new Chart version was released in between.
In general, we should consider treating the GitLab Helm Chart like any other software dependency and pin to a specific version. When new versions of the Chart come out, we can bump the version in an MR and confirm via our tests if the new Chart version is compatible with the Operator or if changes to the Operator are required.
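To make the pinning idea concrete, here is a rough sketch of how a script such as test.sh could resolve the Chart version. Everything in it is hypothetical: the function name resolve_chart_version and the idea of a one-line pin file named CHART_VERSION committed to the repository are assumptions for illustration, not existing parts of the Operator.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: resolve the Chart version from an explicit override or
# from a pinned default, never from "whatever is latest right now".
# Assumption: the pin is a one-line file (default name: CHART_VERSION).
resolve_chart_version() {
  local pin_file="${1:-CHART_VERSION}"
  if [ -n "${CHART_VERSION:-}" ]; then
    printf '%s\n' "$CHART_VERSION"   # explicit override, as in the working scenario
  elif [ -r "$pin_file" ]; then
    head -n 1 "$pin_file"            # pinned default committed to the repository
  else
    echo "no Chart version pinned in $pin_file" >&2
    return 1                         # fail loudly instead of guessing a version
  fi
}
```

Bumping the Chart would then be a one-line change to the pin file in an MR, and the pipeline result for a given commit would not depend on when it runs.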
Open to any thoughts here! Thanks.
/cc @WarheadsSE @pursultani @dustinmm80 FYI