Flaky GKE tests
Problem
Some tests on GKE are very unreliable and fail for different reasons. This slows down development and shipping of other features whenever they are tested on GKE. For example:
Sentry
While developing the fluentd integration:
https://gitlab.com/gitlab-org/cluster-integration/cluster-applications/-/jobs/506703763
FAILED RELEASES:
NAME
sentry
in /usr/local/share/gitlab-managed-apps/helmfile.yaml: in .helmfiles[5]: in sentry/helmfile.yaml: failed processing release sentry: helm exited with status 1:
Error: Job failed: BackoffLimitExceeded
While developing the Knative integration:
https://gitlab.com/gitlab-org/cluster-integration/cluster-applications/-/jobs/506024938
FAILED RELEASES:
NAME
sentry
in /usr/local/share/gitlab-managed-apps/helmfile.yaml: in .helmfiles[5]: in sentry/helmfile.yaml: failed processing release sentry: helm exited with status 1:
Error: timed out waiting for the condition
Cilium
The job got stuck in the master branch pipeline after the fluentd integration was merged:
https://gitlab.com/gitlab-org/cluster-integration/cluster-applications/-/jobs/506791243
Falco
Failed on master after updating the Runner image tag (a completely unrelated change). It passed after a retry.
https://gitlab.com/gitlab-org/cluster-integration/cluster-applications/-/jobs/679554352
https://gitlab.com/gitlab-org/cluster-integration/cluster-applications/-/jobs/671081235
certmanager-no-issuer
https://gitlab.com/gitlab-org/cluster-integration/cluster-applications/-/jobs/692778366
Proposal
- As a quick fix, I suggest we comment out the flaky tests.
- Then open a separate MR to try to fix them.