Exceptions not properly handled in Snowplow Product Analytics Worker
Overview
When initializing a project with Product Analytics (with the snowplow flag enabled), the retry mechanism isn't working correctly. When a failure is caused by an exception, the error is tracked in Sentry, but the job neither retries as expected nor removes the lock from the project after a couple of attempts.
In theory, after 2 retries, the job should fail and remove the lock, allowing the user to retry again. Both @dennis and @jheimbuck_gl have noted that this doesn't happen.
Whilst we track errors and failures in Sentry (https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/product_analytics/initialize_snowplow_product_analytics_worker.rb#L38), we don't then fail the job.
Suggested fix
- After tracking any exceptions to Sentry (https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/product_analytics/initialize_snowplow_product_analytics_worker.rb#L41), use the track_and_raise_exception method instead of track_and_raise_for_dev_exception so that the exception is raised in both dev and production. (This is why we missed this bug: it worked in development.)
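To illustrate why the choice of tracking method matters, here is a minimal, self-contained sketch. `FakeErrorTracking`, its `dev_or_test` switch, and `perform_with` are hypothetical stand-ins written for this example; they only mimic the behaviour described above and are not the real `Gitlab::ErrorTracking` implementation or the actual worker code.

```ruby
# Hypothetical stand-in for the error tracker, simplified for illustration.
module FakeErrorTracking
  @tracked = []

  class << self
    attr_reader :tracked
    attr_accessor :dev_or_test

    # Records the exception and always re-raises it.
    def track_and_raise_exception(exception)
      @tracked << exception
      raise exception
    end

    # Records the exception but re-raises only in dev/test.
    def track_and_raise_for_dev_exception(exception)
      @tracked << exception
      raise exception if dev_or_test
    end
  end
end

# Hypothetical worker body: the work fails, and the rescue block hands the
# exception to whichever tracking method is being compared.
def perform_with(tracker_method)
  raise StandardError, 'initialization failed'
rescue StandardError => e
  FakeErrorTracking.public_send(tracker_method, e)
  :completed # only reached when the tracker swallowed the exception
end

FakeErrorTracking.dev_or_test = false # simulate production

# In production the "for_dev" variant swallows the exception, so the job
# looks successful and nothing triggers a retry.
swallowed = perform_with(:track_and_raise_for_dev_exception)

# The unconditional variant propagates the exception, so the job fails.
raised = begin
  perform_with(:track_and_raise_exception)
  false
rescue StandardError
  true
end
```

With `dev_or_test` set to `true`, both calls raise, which matches the observation that the problem was not visible in development.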
Edited by Max Woolf