
Show failed jobs in MR pipelines tab part 5

Frédéric Caplette requested to merge fc-add-failed-jobs-in-mr-part-5 into master

What does this MR do and why?

Add failed job traces to the MR pipelines tab

Behind a FF, we add the ability to see failed job traces directly inside the pipelines tab of an MR. Failed jobs are only fetched when clicking the button to expand the table. Each failed job's trace is then fetched and can be clicked to expand the job log and see why the job failed.

IMPORTANT Feature specs are also going to be written in the very last MR (or a separate MR, depending on size) to keep these really focused on FE only.

This is part 5 of 5 in a series of MRs!

This is a complex MR 😅 We are adding polling and making sure to only poll when it is relevant. That means (a rough sketch of the logic follows this list):

  • Poll for the job count while the widget is closed, to get a rough number of how many jobs failed.
  • Show the "Show failed jobs (count)" text at all times, not just when there are failed jobs. This is more consistent, and since we need to poll anyway, a pipeline that previously had no failed jobs will update the count once failures appear. If we showed nothing until then, the text would have "popped" in, so it feels much nicer to see the count increment.
  • When the widget is opened, stop polling for the count and poll for the list of failed jobs instead, and vice versa when it is closed.
  • Only poll while the pipeline is active. This means that if you open the widget and the pipeline is not running, there is no polling! If the state changes, we start/stop polling accordingly.
  • Sadly, there are 2 sources of actions: GraphQL queries (this widget) vs. the rest of the list, which is all REST. While polling, we check the pipeline.active field; when it becomes false, the pipeline has finished and we stop polling. However, there are also REST actions like retrying or stopping the pipeline, which is why we take a second value, isPipelineActive, as a prop: when it changes, a REST call was sent and we now have a new pipeline state!
  • We are also adding ETag caching: if we poll and the pipeline state has not changed, you will see a 304 response, which is great for performance!
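To help review, here is roughly the shape of that logic as a minimal sketch. It is not the literal code in this MR: the query files, field paths, `POLL_INTERVAL`, and `isExpanded` are illustrative names. It leans on the vue-apollo smart-query `startPolling`/`stopPolling` API:

```javascript
// Minimal sketch — query files, field paths, and names are illustrative.
import failedJobsCountQuery from './graphql/queries/failed_jobs_count.query.graphql';
import failedJobsQuery from './graphql/queries/failed_jobs.query.graphql';

const POLL_INTERVAL = 10000;

export default {
  props: {
    // Flipped by the surrounding REST-driven pipelines list (retry/cancel).
    isPipelineActive: { type: Boolean, required: true },
  },
  data() {
    return {
      isExpanded: false,      // whether the failed jobs widget is open
      graphqlIsActive: false, // pipeline.active as last reported by GraphQL
    };
  },
  apollo: {
    failedJobsCount: {
      query: failedJobsCountQuery,
      update: (data) => data?.project?.pipeline?.failedJobs?.count,
      result({ data }) {
        // Track the pipeline state so polling can stop when it finishes.
        this.graphqlIsActive = data?.project?.pipeline?.active ?? false;
      },
    },
    failedJobs: {
      query: failedJobsQuery,
      skip() {
        return !this.isExpanded; // only fetch the list once the widget is open
      },
    },
  },
  computed: {
    // GraphQL tells us the pipeline finished; the REST prop tells us it restarted.
    shouldPoll() {
      return this.isPipelineActive || this.graphqlIsActive;
    },
  },
  watch: {
    shouldPoll: 'togglePolling',
    isExpanded: 'togglePolling',
  },
  created() {
    this.togglePolling();
  },
  methods: {
    togglePolling() {
      const count = this.$apollo.queries.failedJobsCount;
      const jobs = this.$apollo.queries.failedJobs;

      if (!this.shouldPoll) {
        count.stopPolling();
        jobs.stopPolling();
        return;
      }
      // Poll only the query that matches the widget state, never both at once.
      if (this.isExpanded) {
        count.stopPolling();
        jobs.startPolling(POLL_INTERVAL);
      } else {
        jobs.stopPolling();
        count.startPolling(POLL_INTERVAL);
      }
    },
  },
};
```

A nice side effect of this shape: closing the widget automatically downgrades us back to the cheap count query.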

I have made several videos that showcase some of these behaviours to help test this MR. Let me know if you need more info! 🙇🏼

MR table:

| Title | Link |
| ----- | ---- |
| Introduce FF and base component | !122914 (merged) |
| Fetch failed jobs | !122917 (merged) |
| Display each failed job in a row with log | !122921 (merged) |
| Add action button to each job row | 👈🏻 You are here |

This is behind a disabled feature flag: `ci_job_failures_in_mr`.

Screenshots or screen recordings

Empty state

Screenshot_2023-06-22_at_3.55.29_PM

Widget closed - Polling for job counts

NOTICE: We start at 7 failed jobs. We retry, and we poll until we get the same jobs back. The active property becomes false, and we stop polling. Also notice that the last call is a 304: we hit the ETag cache because we already had the latest data (a tiny illustration of the conditional request follows the recording below).

Screen_Recording_2023-06-22_at_3.23.36_PM
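For the curious, the 304 is standard HTTP conditional requests at work. A tiny illustration only — the endpoint, query, and ETag value are made up here, and this is not how our Apollo client wires it up internally:

```javascript
// Illustration only — endpoint, query, and ETag value are made up. On each
// poll the client echoes the ETag from the previous response; if the pipeline
// state is unchanged, the server replies 304 Not Modified with no body.
const query = '{ project { pipeline { active } } }'; // placeholder query

const response = await fetch('/api/graphql', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'If-None-Match': 'W/"abc123"', // ETag captured from the previous poll
  },
  body: JSON.stringify({ query }),
});

console.log(response.status); // 304 when nothing changed, 200 otherwise
```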

Widget opened - Polling for jobs when retrying a job

NOTICE: We open the widget: there is no polling because the pipeline is not active. Then we retry a job, the pipeline becomes active, and we poll. Once the single retried job is done, the active property becomes false and we stop polling.

Screen_Recording_2023-06-22_at_3.29.32_PM

Widget opened - Polling for jobs with REST actions

NOTICE: We retry the whole pipeline (a REST action), the pipeline becomes active, and we poll. Once the single retried job is done, the active property becomes false and we stop polling. Note the final 2 calls: because the action is REST, we want to make sure we also get the latest data from GraphQL before we stop polling, so we make one final query for both the jobCount and the jobs (a small sketch of this final refetch follows the recording below).

Screen_Recording_2023-06-22_at_3.31.37_PM
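In sketch form, reusing the hypothetical smart queries from the component sketch above:

```javascript
// Sketch only — `queries` holds the two hypothetical vue-apollo smart queries
// (job count and job list) from the component sketch above. When the pipeline
// goes inactive after a REST action, fetch once more so the GraphQL data
// catches up, then stop polling.
function onPipelineFinished(queries) {
  queries.forEach((query) => {
    query.refetch(); // the "final 2 calls" visible in the recording
    query.stopPolling();
  });
}
```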

How to set up and validate locally

  1. Enable the feature flag in the rails console: `Feature.enable(:ci_job_failures_in_mr)`
  2. Make sure to have a functioning runner
  3. We need an MR with job failures. An easy way to do so is to make a merge request on the CI config file and add explicit failures
  4. Navigate to Build -> Pipeline Editor
  5. Change your config to include job failures like so:

```yaml
my_failed_job:
  script: exit 1
```
  6. Create a merge request with that change
  7. Navigate to the merge request page
  8. Go to the pipelines tab
  9. Click on "Show failed jobs"
  10. If your pipeline is running, notice the updates
  11. Click "Retry" on any job
  12. Notice it updates (polling works!)
  13. Retry the entire pipeline
  14. Notice it polls!

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
