Collection and Tracking of Evaluation Outputs
Problem Statement
Each evaluation run is subject to some randomness, so results vary from run to run. We currently do not collect and track evaluation outputs separately, which makes it difficult to aggregate runs over time, gain useful insights, or track progress.
Exit Criteria
A system that collects evaluation outputs and tracks them separately has been implemented, and it provides useful insights and progress indicators.
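
To make the exit criteria more concrete, here is a minimal sketch of what collecting and aggregating run outputs could look like. It is illustrative only and not a proposed implementation: the JSON Lines storage format, the function names `record_run` and `summarize`, and the metric name `accuracy` are all assumptions, and nothing here reflects an existing system.

```python
import json
import statistics
from pathlib import Path


def record_run(store: Path, run_id: str, scores: dict[str, float]) -> None:
    """Append one evaluation run to the store as a single JSON line.

    Hypothetical helper: the record layout is an assumption for illustration.
    """
    with store.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"run_id": run_id, "scores": scores}) + "\n")


def summarize(store: Path) -> dict[str, dict[str, float]]:
    """Aggregate all stored runs into a per-metric run count, mean, and spread."""
    by_metric: dict[str, list[float]] = {}
    with store.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines in the store
            for metric, value in json.loads(line)["scores"].items():
                by_metric.setdefault(metric, []).append(value)
    return {
        metric: {
            "runs": len(values),
            "mean": statistics.fmean(values),
            # Sample standard deviation needs at least two runs.
            "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        }
        for metric, values in by_metric.items()
    }


if __name__ == "__main__":
    store = Path("eval_runs.jsonl")  # hypothetical location
    record_run(store, "run-1", {"accuracy": 0.5})
    record_run(store, "run-2", {"accuracy": 1.0})
    print(summarize(store))
```

Keeping each run as its own record means the raw outputs stay available. Reporting the spread next to the mean helps separate real progress from run-to-run randomness.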
Edited by David O'Regan