Resolve "Logs transitions and errors for BatchedJob"
What does this MR do and why?
In this MR, we are logging every state transition for batched jobs. If a job fails, we also store the exception raised. This information can be helpful during debugging.
I decided to use a state machine for the batched job model to achieve this.
Why are we using the state_machines-activerecord gem?
- We are already using this gem, and we know that it is stable.
- It treats events and states as separate concepts.
- It supports observers.
- It supports declarative transition and state definitions.
- It supports callbacks scoped to specific transitions.
Specific use for batched jobs:
Logging:
after_transition do |job, transition|
  # Arguments passed to the event (for example `failure!(error: e)`) are available in transition.args.
  exception = transition.args.find { |arg| arg[:error].present? }

  Gitlab::ErrorTracking.track_exception(exception[:error], batched_job_id: job.id) if exception

  Gitlab::AppLogger.info(
    message: 'BatchedJob transition',
    batched_job_id: job.id,
    previous_state: transition.from_name,
    new_state: transition.to_name
  )
end
We can use the after_transition callback to log every transition, along with any error passed to the event.
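To make the error lookup in the callback concrete, here is a minimal standalone sketch in plain Ruby. It does not load the gem: the array stands in for transition.args, and the helper name error_from_transition_args is hypothetical. It also adds a Hash check that the callback above does not have, on the assumption that a caller could someday pass a non-Hash argument to an event, in which case arg[:error] would raise.

```ruby
# Hypothetical helper: `args` stands in for `transition.args`,
# i.e. whatever was passed to the event method such as `failure!(error: e)`.
def error_from_transition_args(args)
  # Skip anything that is not a Hash so that `arg[:error]` cannot raise.
  entry = args.find { |arg| arg.is_a?(Hash) && arg[:error] }
  entry && entry[:error]
end

boom = RuntimeError.new('boom')

found = error_from_transition_args([{ error: boom }]) # the same RuntimeError instance
missing = error_from_transition_args([])              # nil, so nothing would be tracked
```

With this shape, events fired without an error (for example a plain run!) produce only the log line, and only events fired with an error: argument are also reported to error tracking.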
State definition:
state_machine :status, initial: :pending do
  state :pending, value: 0
  state :running, value: 1
  state :failed, value: 2
  state :succeeded, value: 3

  event :succeed do
    transition [:pending, :running, :succeeded, :failed] => :succeeded
  end

  event :failure do
    transition [:running, :failed, :succeeded, :pending] => :failed
  end

  event :run do
    transition [:failed, :pending, :running, :succeeded] => :running
  end
end
We can define specific rules for each transition.
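The definition above allows every event from every state. As an illustration of what stricter per-transition rules could look like, here is a small plain-Ruby sketch. It is not the gem's DSL and not part of this MR: the rule table and the next_status helper are hypothetical, and the specific restrictions are chosen only for the example.

```ruby
# Hypothetical rule table: which source states each event accepts,
# and which state it leads to. These restrictions are illustrative only.
def allowed_transitions
  {
    run: { from: %i[pending failed], to: :running },
    succeed: { from: %i[running], to: :succeeded },
    failure: { from: %i[running], to: :failed }
  }
end

# Returns the state an event leads to, or raises if the rule table forbids it.
def next_status(current, event)
  rule = allowed_transitions.fetch(event) do
    raise ArgumentError, "unknown event: #{event}"
  end

  unless rule[:from].include?(current)
    raise ArgumentError, "cannot #{event} from #{current}"
  end

  rule[:to]
end

next_status(:pending, :run)     # allowed by the table above
next_status(:running, :failure) # allowed by the table above
```

With the gem, the same intent would be expressed by narrowing the source states listed in each transition line, so that an invalid event is rejected by the state machine instead of by hand-written checks.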
Better way to handle transitions:
Currently:
tracking_record.status = :failed
tracking_record.finished_at = Time.current
tracking_record.save!
With the state machine, the caller only fires the event:
tracking_record.failure!(error: error)
and the timestamp is set in a callback on the model:
state_machine :status do
  before_transition do |migration, transition|
    migration.finished_at = Time.current if transition.event == :failure
  end
end
Documentation: https://www.rubydoc.info/github/state-machines/state_machines-activerecord/StateMachines/Integrations/ActiveRecord
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to #346359 (closed)