ci: Include test files filtering in Knapsack allocation (!110359) · Merge requests · GitLab.org / GitLab

Rémy Coutable requested to merge include-filtered-tests-in-knapsack-allocation into master Jan 27, 2023

What does this MR do and why?

The goal is to optimize the distribution of the rspec * predictive jobs.

Let's imagine we have this Knapsack report of testfile -> duration:

| Test file | Duration |
| --- | --- |
| `testfile1` | 10 |
| `testfile2` | 20 |
| `testfile3` | 30 |
| `testfile4` | 40 |
| `testfile5` | 50 |

Based on the changes, detect-tests decides that we only need to run `testfile1`, `testfile3`, and `testfile5`.

Let's say we parallelize the rspec * predictive jobs on two nodes.

Before

Before this change, the following would happen:

First, Knapsack would generate the list of tests to run on `node1` and `node2` without taking into account the list of test files that will actually be run. Based on the test file durations, the distribution would be:

| Node | Test files | Total expected duration |
| --- | --- | --- |
| `node1` | `testfile5`, `testfile3` | 50 + 30 = 80 |
| `node2` | `testfile4`, `testfile2`, `testfile1` | 40 + 20 + 10 = 70 |

Then, our ParallelRSpecRunner wrapper would filter the list of tests further, so that the end result would be as follows:

| Node | Test files | Total expected duration |
| --- | --- | --- |
| `node1` | `testfile5`, `testfile3` | 50 + 30 = 80 |
| `node2` | `testfile1` | 10 |

The distribution is poorly balanced: `node1` is expected to run for 80 while `node2` runs for only 10.
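The "before" behaviour can be reproduced with a small sketch. It is not the actual ParallelRSpecRunner code: the function names `filter_after_allocation` and `node_totals` are hypothetical, and the allocation is copied from the first table above rather than computed by Knapsack.

```python
# Durations from the Knapsack report above.
durations = {
    "testfile1": 10,
    "testfile2": 20,
    "testfile3": 30,
    "testfile4": 40,
    "testfile5": 50,
}

# Allocation over the full report, copied from the first "Before" table
# (assumed input; not computed here).
allocation = {
    "node1": ["testfile5", "testfile3"],
    "node2": ["testfile4", "testfile2", "testfile1"],
}

# Test files selected by detect-tests.
selected = {"testfile1", "testfile3", "testfile5"}


def filter_after_allocation(allocation, selected):
    """Drop unselected files from each node's already-fixed list."""
    return {
        node: [f for f in files if f in selected]
        for node, files in allocation.items()
    }


def node_totals(allocation, durations):
    """Sum the expected duration of the files assigned to each node."""
    return {
        node: sum(durations[f] for f in files)
        for node, files in allocation.items()
    }


filtered = filter_after_allocation(allocation, selected)
totals = node_totals(filtered, durations)
print(filtered)  # {'node1': ['testfile5', 'testfile3'], 'node2': ['testfile1']}
print(totals)    # {'node1': 80, 'node2': 10}
```

Because the filter runs after the split is fixed, the files removed from `node2` cannot be compensated for by moving work over from `node1`.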

After

With this change, the following would happen:

First, Knapsack would generate the list of tests to run on `node1` and `node2`, taking into account the list of test files that will actually be run. Based on the test file durations, the distribution would be:

| Node | Test files | Total expected duration |
| --- | --- | --- |
| `node1` | `testfile5` | 50 |
| `node2` | `testfile3`, `testfile1` | 30 + 10 = 40 |

Then, our ParallelRSpecRunner wrapper doesn't have to filter the list of tests further since Knapsack already did the filtering.

The distribution is now much better balanced: 50 on `node1` versus 40 on `node2`.
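A minimal sketch of the "after" behaviour, filtering before allocating. The allocation strategy here is an assumption: a greedy "longest file first, least-loaded node" heuristic, which happens to reproduce the table above but is not necessarily how Knapsack distributes tests internally. The function name `allocate` is hypothetical.

```python
# Durations from the Knapsack report above.
durations = {
    "testfile1": 10,
    "testfile2": 20,
    "testfile3": 30,
    "testfile4": 40,
    "testfile5": 50,
}

# Test files selected by detect-tests.
selected = {"testfile1", "testfile3", "testfile5"}


def allocate(durations, selected, node_count):
    """Filter first, then assign each file (longest first) to the least-loaded node."""
    candidates = sorted(
        (f for f in durations if f in selected),
        key=lambda f: durations[f],
        reverse=True,
    )
    nodes = [{"files": [], "total": 0} for _ in range(node_count)]
    for f in candidates:
        # min() returns the first least-loaded node, so ties go to the lowest index.
        target = min(nodes, key=lambda n: n["total"])
        target["files"].append(f)
        target["total"] += durations[f]
    return nodes


nodes = allocate(durations, selected, node_count=2)
for index, node in enumerate(nodes, start=1):
    print(f"node{index}: {node['files']} total={node['total']}")
```

Since only the selected files enter the allocation, every node's expected total reflects work that will actually run.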

Next steps

As a further optimization, we should set the number of parallel nodes dynamically based on the number of test files to run, similar to what we already do for the rspec foss-impact child pipeline.
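One way such dynamic parallelization could look is sketched below. This is purely illustrative: the function `node_count_for`, the `files_per_node` target, and the `max_nodes` cap are hypothetical values chosen for the example, and the rule used by the foss-impact child pipeline is not described in this MR.

```python
import math


def node_count_for(test_file_count, files_per_node=50, max_nodes=10):
    """Pick a node count proportional to the number of test files.

    files_per_node and max_nodes are hypothetical tuning values.
    Always returns at least one node and never more than max_nodes.
    """
    wanted = math.ceil(test_file_count / files_per_node)
    return max(1, min(max_nodes, wanted))


print(node_count_for(3))      # a handful of files fits on one node
print(node_count_for(120))    # 120 files at 50 per node needs 3 nodes
print(node_count_for(10000))  # very large runs are capped at max_nodes
```

A duration-based rule (total expected duration divided by a target job length) would be an alternative to counting files, since the Knapsack report already provides per-file durations.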


MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

I have evaluated the MR acceptance checklist for this MR.

Edited Jan 27, 2023 by Rémy Coutable
