Improve Runner group release process - Pipeline performance
Overview
In May 2020, Improve flakyness and speed of pipeline (#25651 - closed) was opened to address issues we faced with our pipeline.
According to a comment at the time, we reduced pipeline durations from ~46 to ~29 minutes.
The pipeline has changed significantly since then, and there have been a lot of improvements (such as building the helper images before tests, so we're correctly testing everything), more distributions, more tests, etc. However, this has had a massive impact on the speed of the pipeline. Today (March 2023) we're at ~2 hours.
This issue is for capturing ideas and solutions for fixing this.
Update as of 28th April 2023:
- Reduced pipeline duration by about 30 minutes, mostly due to one change: !3997 (merged)
- Fixed a ton of flaky tests.
- All unit race tests have been fixed.
- Windows jobs and race unit tests are no longer allowed to fail.
- Splitic, the new test runner, is merged: !3967 (merged). This reduces the number of parallel jobs we were having to create to run tests, as they're now run more efficiently.
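
To illustrate why a smarter test splitter can need fewer parallel jobs, here is a minimal sketch of duration-based splitting. It is not Splitic's actual implementation: the function name `split_by_duration`, the test names, and the timings are all hypothetical, and the greedy "longest test first, into the lightest bucket" strategy is just one common way to balance work across jobs.

```python
import heapq


def split_by_duration(durations, jobs):
    """Assign tests to `jobs` buckets so the total runtime per bucket is balanced.

    Hypothetical helper: takes a mapping of test name -> seconds and returns
    one list of test names per job.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")

    # Min-heap of (accumulated seconds, bucket index); the lightest bucket is on top.
    heap = [(0.0, i) for i in range(jobs)]
    heapq.heapify(heap)
    buckets = [[] for _ in range(jobs)]

    # Place the longest tests first; ties are broken by name for determinism.
    for name, secs in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        buckets[idx].append(name)
        heapq.heappush(heap, (total + secs, idx))

    return buckets


# Made-up timings, purely for illustration.
durations = {
    "TestBuild": 300.0,
    "TestDocker": 240.0,
    "TestArtifacts": 180.0,
    "TestCache": 120.0,
    "TestShell": 60.0,
}
buckets = split_by_duration(durations, 2)
totals = [sum(durations[name] for name in bucket) for bucket in buckets]
print(buckets)
print(totals)
```

With balanced buckets, the slowest job finishes close to the average, so the same wall-clock time can be reached with fewer jobs than an even split by test count would need.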
Edited by Arran Walker