Fix flaky test `TestDockerCommandRunAttempts`
What does this MR do?
Fix flaky test TestDockerCommandRunAttempts
Why was this MR needed?
Background
When `removeContainer` is called, the container exits with code 137. The Runner sometimes picks up that exit code and returns it, as explained in detail in #25385 (comment 324486793). This makes the test fail intermittently.
Fix
Add `testAttempts`, which runs the test up to x times, increasing the odds of getting the expected value. If the test still fails after all attempts, there is a high chance that the failure is legitimate.
The Runner could simply retry the job section when the exit code is 137, but it is useful to show that exit code to the user: it usually means the OOM killer killed the container, as explained in https://success.docker.com/article/what-causes-a-container-to-exit-with-code-137, and the user should know when their job is being killed by the OOM killer.
Reproduce
It is quite hard to reproduce because the failure is random and there is no real way to force the exit code. The only way to trigger it is to call `ContainerStop` with the following patch:
https://gitlab.com/snippets/1967002
Does this MR meet the acceptance criteria?
- Documentation created/updated
- Added tests for this feature/bug
- In case of conflicts with master - branch was rebased
What are the relevant issue numbers?
Closes #25385