Display metric values per epoch
Models that train iteratively over epochs within a run, such as neural networks, need a chart of metrics/loss vs. epoch to spot training issues while debugging. From what I understand, MLflow logs all metrics/loss per epoch, so this should be possible. Currently, however, GitLab displays only the most recent metric value. This is a big miss for anyone training anything beyond classic models.
While graphing is ideal, the first step can be to display a table on the Candidate page (/-/ml/candidates/{id}) showing metrics per epoch. Epochs are stored in the ml_candidate_metrics table, in the step column. We can later change this to line graphs.
Implementation Details
Backend
In CandidatesPresenter, the metrics attribute should be an array of objects containing { name: metric_name, value: metric_value, step: metric_step }. This doesn't need to be ordered.
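To make the expected payload concrete, here is a rough, language-neutral sketch in Python (the real presenter would be Ruby). MetricRow and present_metrics are hypothetical names used only for illustration; the only things taken from this issue are the name, value, and step keys.

```python
from typing import NamedTuple


class MetricRow(NamedTuple):
    """Hypothetical stand-in for one row of ml_candidate_metrics."""
    name: str
    value: float
    step: int


def present_metrics(rows):
    """Return one {name, value, step} object per stored row, in no particular order."""
    return [{"name": r.name, "value": r.value, "step": r.step} for r in rows]


# Illustrative data only: two metrics logged over two epochs.
rows = [
    MetricRow("loss", 0.9, 1),
    MetricRow("accuracy", 0.55, 1),
    MetricRow("loss", 0.4, 2),
    MetricRow("accuracy", 0.81, 2),
]
metrics = present_metrics(rows)
```

Every (metric, step) pair becomes its own object, so a metric logged over N epochs contributes N entries instead of only the latest one.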
Frontend
Change ml_candidate_show to display a table using GlTableLite (each row a metric, each column an epoch) instead of the list of metrics. Keep the metadata and parameters display the same.
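The pivot from the flat metrics array to a "row per metric, column per epoch" layout could look roughly like the sketch below. It is written in Python only to show the logic (the actual change would live in the Vue component), and it assumes the table is fed a list of column definitions plus a list of row objects. pivot_metrics and the step_N column keys are made-up names, not existing code.

```python
def pivot_metrics(metrics):
    """Turn [{name, value, step}, ...] into (fields, items) for a table.

    fields: one column for the metric name, then one column per epoch.
    items:  one row per metric; a missing (metric, step) cell is simply absent.
    """
    steps = sorted({m["step"] for m in metrics})
    fields = [{"key": "name", "label": "Metric"}] + [
        {"key": f"step_{s}", "label": str(s)} for s in steps
    ]

    rows_by_name = {}
    for m in metrics:
        row = rows_by_name.setdefault(m["name"], {"name": m["name"]})
        row[f"step_{m['step']}"] = m["value"]

    # Sort rows by metric name so the table order is stable between renders.
    items = [rows_by_name[name] for name in sorted(rows_by_name)]
    return fields, items


# Illustrative data only; accuracy was not logged at step 1.
fields, items = pivot_metrics([
    {"name": "loss", "value": 0.9, "step": 1},
    {"name": "loss", "value": 0.4, "step": 2},
    {"name": "accuracy", "value": 0.81, "step": 2},
])
```

Because the backend array is unordered, the sketch sorts the steps and metric names itself. Metrics that were not logged at every epoch end up with empty cells rather than breaking the layout.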