Migrate email_campaign_counts to use Metric Instrumentation class
What does this MR do and why?
Related to #339444 (closed)
We want to move metrics-related code out of our chunky `UsageData` class into a system of YAML files containing metric definitions, backed by metric instrumentation classes.
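For orientation, a metric instrumentation class pairs a YAML definition with a small Ruby class that knows how to compute the metric's value from the `options:` in its definition. Below is a self-contained toy sketch of that shape; the `BaseMetric` stand-in, the `value` method, and the in-memory `rows` are all illustrative, not GitLab's real `Gitlab::Usage::Metrics::Instrumentations` API:

```ruby
# Toy stand-in for a metric instrumentation base class; illustrative only.
class BaseMetric
  def initialize(options)
    @options = options
  end

  attr_reader :options
end

# Hypothetical shape of a metric class configured via the YAML `options:` keys
# this MR adds (only_clicked / track / series).
class InProductMarketingEmailMetric < BaseMetric
  # `rows` stands in for Users::InProductMarketingEmail records.
  def value(rows)
    scoped = rows.select { |r| r[:track] == options[:track] && r[:series] == options[:series] }
    scoped = scoped.reject { |r| r[:cta_clicked_at].nil? } if options[:only_clicked]
    scoped.size
  end
end

metric = InProductMarketingEmailMetric.new(only_clicked: true, track: 'create', series: 0)
rows = [
  { track: 'create', series: 0, cta_clicked_at: nil },
  { track: 'create', series: 0, cta_clicked_at: '2024-01-01' }
]
puts metric.value(rows) # => 1 (only the clicked row counts)
```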
This is the bash script that was used to update the YAML files:

```shell
metrics_names=(
  "in_product_marketing_email_create_0_sent" "in_product_marketing_email_create_0_cta_clicked"
  "in_product_marketing_email_create_1_sent" "in_product_marketing_email_create_1_cta_clicked"
  "in_product_marketing_email_create_2_sent" "in_product_marketing_email_create_2_cta_clicked"
  "in_product_marketing_email_verify_0_sent" "in_product_marketing_email_verify_0_cta_clicked"
  "in_product_marketing_email_verify_1_sent" "in_product_marketing_email_verify_1_cta_clicked"
  "in_product_marketing_email_verify_2_sent" "in_product_marketing_email_verify_2_cta_clicked"
  "in_product_marketing_email_trial_0_sent" "in_product_marketing_email_trial_0_cta_clicked"
  "in_product_marketing_email_trial_1_sent" "in_product_marketing_email_trial_1_cta_clicked"
  "in_product_marketing_email_trial_2_sent" "in_product_marketing_email_trial_2_cta_clicked"
  "in_product_marketing_email_team_0_sent" "in_product_marketing_email_team_0_cta_clicked"
  "in_product_marketing_email_team_1_sent" "in_product_marketing_email_team_1_cta_clicked"
  "in_product_marketing_email_team_2_sent" "in_product_marketing_email_team_2_cta_clicked"
  "in_product_marketing_email_team_short_0_sent" "in_product_marketing_email_team_short_0_cta_clicked"
  "in_product_marketing_email_trial_short_0_sent" "in_product_marketing_email_trial_short_0_cta_clicked"
  "in_product_marketing_email_admin_verify_0_sent" "in_product_marketing_email_admin_verify_0_cta_clicked"
)

for metric_name in "${metrics_names[@]}"; do
  find "gitlab-development-kit/gitlab/config/metrics/counts_all" -type f -name "*${metric_name}.yml" | while read -r FILE; do
    echo "$metric_name"
    # Extract track, series, and sent/cta_clicked suffix from the metric name.
    [[ $metric_name =~ in_product_marketing_email_(.*)_([0-9])_(.*) ]]
    track="${BASH_REMATCH[1]}"
    series="${BASH_REMATCH[2]}"
    if [ "${BASH_REMATCH[3]}" == "cta_clicked" ]; then
      only_clicked="true"
    else
      only_clicked="false"
    fi
    # Note: "-i ''" is BSD/macOS sed syntax; with GNU sed, use "-i" alone.
    sed -i '' "s/data_source: database/data_source: database\ninstrumentation_class: InProductMarketingEmailMetric\noptions:\n only_clicked: $only_clicked\n track: $track\n series: $series/" "$FILE"
  done
done
```
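For reference, after the script runs, each matching metric definition gains `instrumentation_class` and `options` keys. For example, for `in_product_marketing_email_create_0_cta_clicked`, the affected part of the YAML file should look roughly like this (only the keys the `sed` command touches are shown; exact indentation as emitted by the script may differ):

```yaml
data_source: database
instrumentation_class: InProductMarketingEmailMetric
options:
  only_clicked: true
  track: create
  series: 0
```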
Screenshots or screen recordings
How to set up and validate locally
This test should make sure that we didn't miss any metrics (e.g., it fails after removing the `instrumentation_class:` key from any of the YAML files).
To test that the actual values for events are also counted correctly:
- Check out the `master` branch
- Trigger the events tested in this MR. Probably the easiest way to do that is by triggering them manually from the console. Here's a script that I used to get some varied seed data:
```ruby
n = 1
[[nil, 0], [Date.yesterday, 50]].each do |cta_clicked_at, user_offset|
  Users::InProductMarketingEmail.tracks.each do |track_name, _track_id|
    series_amount = Namespaces::InProductMarketingEmailsService.email_count_for_track(track_name)
    0.upto(series_amount - 1).each do |series|
      puts "s #{series} t #{track_name} n #{n}"
      n.times do |x|
        Users::InProductMarketingEmail.create!(
          cta_clicked_at: cta_clicked_at,
          track: track_name,
          series: series,
          user: User.order(:id).offset(user_offset + x).first
        )
      end
      n += 1
    end
    n += 1
  end
  n += 1
end
```
- Run this in the console:
```ruby
email_campaign_counts = Gitlab::UsageData.send :email_campaign_counts
events = email_campaign_counts.keys
service_ping = Gitlab::Usage::ServicePingReport.for(output: :all_metrics_values)
values1 = service_ping[:counts].values_at(*events)
```
- This will save the event statistics in the `values1` variable. The `values1` array should include some non-zero values if the events have been triggered successfully. Now we need to compare it with the result we get with this MR:
- Check out this MR's branch
- Run this in the console:
```ruby
reload!
service_ping = Gitlab::Usage::ServicePingReport.for(output: :all_metrics_values)
values2 = service_ping[:counts].values_at(*events)
```
- Now, `values1` and `values2` should hold the same values. This proves that the events have been added to the Service Ping data correctly:

```ruby
values2 == values1
# => true
```
Database review
Note: We've already been loading this data via the `UsageData` class. However, before this change the data was loaded by two grouped queries, while with this MR we execute a separate SQL query for each unique `"in_product_marketing_emails"."series"` / `"in_product_marketing_emails"."track"` / sent-vs-CTA-clicked combination. There are 30 such combinations.
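As a sanity check on that count: the metric list in the migration script has four tracks with three emails each and three tracks with a single email, and each (track, series) pair yields two metrics (sent and CTA-clicked):

```ruby
# Derived from the metric names in the migration script above.
tracks_with_three_emails = %w[create verify trial team]            # series 0..2
tracks_with_one_email    = %w[team_short trial_short admin_verify] # series 0 only

pairs   = tracks_with_three_emails.size * 3 + tracks_with_one_email.size
queries = pairs * 2 # one count for "sent", one for "cta_clicked"

puts pairs   # => 15
puts queries # => 30
```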
Index creation: https://gitlab.slack.com/archives/CLJMDRD8C/p1668079202704819
Query plans
InProductMarketingEmailCtaClickedMetric
Note: This metric seems to always return a 0 value, which is consistent with what we have already been reporting for it: https://app.periscopedata.com/app/gitlab/1001959/Micha%C5%82's-testing-dashboard?widget=14826361&udv=0
Select ids for batching:
- min: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13161/commands/46148
- max: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13161/commands/46149
Load results: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13161/commands/46150
InProductMarketingEmailCtaClickedMetric
Select ids for batching:
- min: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13161/commands/46151
- max: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13161/commands/46152
Load first batch: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13161/commands/46153
Load second batch: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13161/commands/46154
-> This is repeated for 12 more batches; each batch covers a range of 100 000 ids.
Load last batch: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/13161/commands/46155
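For intuition, the batched load above walks the id range between the min and max returned by the batching queries in fixed-size windows. A minimal sketch of that windowing; the id bounds here are hypothetical, chosen so that the first, second, 12 middle, and last batches above add up to 15, and only the 100 000-id batch size comes from the plans:

```ruby
# Illustrative batched-count windowing; min_id/max_id are made-up example bounds.
batch_size = 100_000
min_id = 1
max_id = 1_500_000 # hypothetical: yields 15 batches, matching the plans above

batches = (min_id..max_id).step(batch_size).map do |start|
  [start, [start + batch_size - 1, max_id].min]
end

puts batches.size          # => 15
puts batches.first.inspect # => [1, 100000]
puts batches.last.inspect  # => [1400001, 1500000]
```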
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.