ci: Include test files filtering in Knapsack allocation
What does this MR do and why?
Reopening of !110359 (merged) after it was reverted.
The goal is to optimize the distribution of the `rspec * predictive` jobs.
Let's imagine we have this Knapsack report of `testfile -> duration`:
Test file | Duration |
---|---|
`testfile1` | 10 |
`testfile2` | 20 |
`testfile3` | 30 |
`testfile4` | 40 |
`testfile5` | 50 |
And based on the changes, `detect-tests` decides that we only need to run `testfile1`, `testfile3`, and `testfile5`.
Let's say we parallelize the `rspec * predictive` jobs on two nodes.
Before
Before this change, the following would happen:
- First, Knapsack would generate a list of tests to be run on `node1` and `node2` without taking into account the list of test files that will actually be run. Based on the test file durations, the distribution would be:
Node | Test files | Total expected duration |
---|---|---|
`node1` | `testfile5`, `testfile3` | 50 + 30 = 80 |
`node2` | `testfile4`, `testfile2`, `testfile1` | 40 + 20 + 10 = 70 |
Then, our `ParallelRSpecRunner` wrapper would filter the list of tests further, so that the end result would be as follows:
Node | Test files | Total expected duration |
---|---|---|
`node1` | `testfile5`, `testfile3` | 50 + 30 = 80 |
`node2` | `testfile1` | 10 |
We can see that the distribution is poorly balanced: `node1` is expected to take 80 while `node2` takes only 10.
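The "Before" flow can be reproduced with a small sketch. It takes the allocation from the first table above as given input rather than re-implementing Knapsack's allocator, and `filter_after_allocation` is a hypothetical stand-in for the filtering done by the wrapper, not its actual code.

```python
# Knapsack report from the example: test file -> expected duration.
REPORT = {"testfile1": 10, "testfile2": 20, "testfile3": 30, "testfile4": 40, "testfile5": 50}

# Allocation copied from the first "Before" table (computed over the full report).
ALLOCATION = {
    "node1": ["testfile5", "testfile3"],
    "node2": ["testfile4", "testfile2", "testfile1"],
}

# Test files selected by detect-tests in the example.
SELECTED = {"testfile1", "testfile3", "testfile5"}


def filter_after_allocation(allocation, selected):
    """Drop unselected files from each node (hypothetical stand-in for the wrapper)."""
    return {node: [f for f in files if f in selected] for node, files in allocation.items()}


def node_totals(allocation, report):
    """Sum the expected duration of the files assigned to each node."""
    return {node: sum(report[f] for f in files) for node, files in allocation.items()}


filtered = filter_after_allocation(ALLOCATION, SELECTED)
print(filtered)                       # node2 keeps only testfile1
print(node_totals(filtered, REPORT))  # {'node1': 80, 'node2': 10}
```

Because the filtering happens after the split, nothing rebalances the nodes once `testfile2` and `testfile4` are dropped.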
After
With this change, the following would happen:
- First, Knapsack would generate a list of tests to be run on `node1` and `node2`, taking into account the list of test files that will actually be run. Based on the test file durations, the distribution would be:
Node | Test files | Total expected duration |
---|---|---|
`node1` | `testfile5` | 50 |
`node2` | `testfile3`, `testfile1` | 30 + 10 = 40 |
Then, our `ParallelRSpecRunner` wrapper doesn't have to filter the list of tests further since Knapsack already did the filtering.
We can see that the distribution is now much better balanced: 50 on `node1` versus 40 on `node2`.
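A minimal sketch of the "After" flow, filtering first and allocating second. The allocator here is a simple longest-first greedy heuristic chosen for illustration; it is an assumption and not Knapsack's actual algorithm, although it happens to reproduce the table above. The `allocate` function and its parameters are hypothetical names.

```python
# Knapsack report from the example: test file -> expected duration.
REPORT = {"testfile1": 10, "testfile2": 20, "testfile3": 30, "testfile4": 40, "testfile5": 50}

# Test files selected by detect-tests in the example.
SELECTED = {"testfile1", "testfile3", "testfile5"}


def allocate(report, selected, node_count):
    """Filter first, then give the longest remaining file to the least-loaded node."""
    candidates = {name: duration for name, duration in report.items() if name in selected}
    nodes = [{"files": [], "total": 0} for _ in range(node_count)]
    for name, duration in sorted(candidates.items(), key=lambda item: item[1], reverse=True):
        # min() returns the first minimal element, so ties go to the lowest-numbered node.
        target = min(nodes, key=lambda node: node["total"])
        target["files"].append(name)
        target["total"] += duration
    return nodes


for index, node in enumerate(allocate(REPORT, SELECTED, 2), start=1):
    print(f"node{index}", node["files"], node["total"])
# node1 ['testfile5'] 50
# node2 ['testfile3', 'testfile1'] 40
```

Since only the selected files reach the allocator, the per-node totals it balances are the totals that will actually run.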
Actual example
Job | Before | After |
---|---|---|
`rspec system predictive 1/3` | 12m52s (+1m10s from average) | 14m9s (+1m39s from average) |
`rspec system predictive 2/3` | 7m15s (-4m27s from average) | 12m0s (-30s from average) |
`rspec system predictive 3/3` | 15m3s (+3m21s from average) | 11m28s (-1m2s from average) |
Average | 11m42s | 12m30s |
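One way to read the table is to compare the gap between the slowest and the fastest job, since the slowest job determines how long the parallel stage takes. The sketch below computes that gap from the durations listed above; the helper names are illustrative only.

```python
import re

# Job durations copied from the table above.
BEFORE = ["12m52s", "7m15s", "15m3s"]
AFTER = ["14m9s", "12m0s", "11m28s"]


def to_seconds(text):
    """Parse a duration such as '12m52s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    return int(match.group(1)) * 60 + int(match.group(2))


def fmt(seconds):
    """Format seconds back into the 'XmYs' notation used in the table."""
    return f"{seconds // 60}m{seconds % 60}s"


def spread(durations):
    """Gap between the slowest and the fastest job, in seconds."""
    seconds = [to_seconds(d) for d in durations]
    return max(seconds) - min(seconds)


print(fmt(spread(BEFORE)))  # 7m48s
print(fmt(spread(AFTER)))   # 2m41s
```

The gap shrinks from 7m48s to 2m41s, and the slowest job drops from 15m3s to 14m9s, even though the average is slightly higher in the "After" run.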
Next steps
As a further optimization, we should parallelize dynamically based on the number of test files to run, similarly to how we do with the `rspec foss-impact` child pipeline.
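As a rough illustration of what dynamic parallelization could look like, the sketch below derives a node count from the number of test files. The function name, the `files_per_node` threshold, and the `max_nodes` cap are all hypothetical values chosen for the example; they do not describe how the `rspec foss-impact` child pipeline actually computes its parallelization.

```python
import math


def node_count_for(test_file_count, files_per_node=20, max_nodes=3):
    """Hypothetical rule: one node per `files_per_node` files, capped at `max_nodes`."""
    if test_file_count <= 0:
        return 0  # nothing to run, so no job is needed
    return min(max_nodes, math.ceil(test_file_count / files_per_node))


for count in (0, 5, 25, 500):
    print(count, node_count_for(count))
# 0 0
# 5 1
# 25 2
# 500 3
```

With a rule like this, a small change set would run on a single node instead of spreading a handful of files across a fixed number of jobs.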
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.