Add Snowplow tracking to Model-Gateway
Problem
We're currently tracking most metrics about our Code Suggestions features in Kibana (see dashboard). While this is useful and accessible to engineers, it's not in line with our normal analytics stack (Snowflake, Sisense Periscope). This makes the current metrics hard for stakeholders outside of engineering to access and report on. Only unique users are currently available in Sisense (see dashboard), since this can be tracked via GitLab.com. The model-gateway is currently not instrumented via Snowplow (our event collection solution).
Desired Outcome
We can track the most important metrics for code suggestions within Sisense Periscope:
- Number of code suggestion requests
- Shown code suggestions
- Accepted code suggestions
- Errored code suggestion requests
- Prompt length
- Suffix length
- Language
- Origin (e.g. which extension, which version)
Proposed Solution
We propose to add Snowplow tracking and use our normal Snowplow infrastructure to track events to enable these metrics and make them available in Sisense. These are suggested steps (feedback welcome):
- !233 (merged) - Add the Snowplow Python Tracker to the model-gateway, set up an AsyncEmitter, and initialize it with the following variables on production:
  - endpoint: snowplow.trx.gitlab.net (our main Snowplow collector)
  - namespace: snowplow_tracker (can be discussed)
  - app_id: gitlab_ai_gateway (can be discussed and needs to be coordinated with the product data team)
- https://gitlab.com/gitlab-org/iglu/-/merge_requests/87 - Add a new context definition to our schema repository. This allows us to track the evolution of the properties for this event. The following properties are proposed:
    {
      "request_counts": {
        "type": "array",
        "description": "Acceptance, show and error counts for previous requests",
        "items": {
          "type": "object",
          "description": "Acceptance, show and error counts for previous request",
          "required": ["requests", "errors", "accepts"],
          "properties": {
            "requests": {
              "type": "integer",
              "description": "Count of completions requests",
              "minimum": 0,
              "maximum": 2147483647
            },
            "errors": {
              "type": "integer",
              "description": "Count of failed completions requests",
              "minimum": 0,
              "maximum": 2147483647
            },
            "accepts": {
              "type": "integer",
              "description": "Count of accepted completions requests",
              "minimum": 0,
              "maximum": 2147483647
            },
            "lang": {
              "type": ["string", "null"],
              "description": "Programming language of the completions request",
              "maxLength": 32
            },
            "model_engine": {
              "type": ["string", "null"],
              "description": "Model engine used for the completions",
              "maxLength": 64
            },
            "model_name": {
              "type": ["string", "null"],
              "description": "Model name used for the completions",
              "maxLength": 64
            }
          }
        }
      },
      "prefix_length": {
        "type": "integer",
        "description": "Length of the prefix in characters",
        "minimum": 0,
        "maximum": 2147483647
      },
      "suffix_length": {
        "type": "integer",
        "description": "Length of the suffix in characters",
        "minimum": 0,
        "maximum": 2147483647
      },
      "language": {
        "type": "string",
        "description": "Programming language of the completions request",
        "maxLength": 32
      },
      "user_agent": {
        "type": "string",
        "description": "User-agent string of the request (holds information about the origin of the request)",
        "maxLength": 255
      },
      "gitlab_realm": {
        "type": "string",
        "description": "Self-Managed or SaaS",
        "maxLength": 32
      }
    }
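To make the proposed properties concrete, here is a minimal sketch of a payload that would fit them, with a small hand-written check of the limits listed above. It uses only the standard library and is not the validation Snowplow itself performs; the helper name `validate_context` and all example values (including the user agent string and the realm value) are made up for illustration.

```python
from typing import Any, Dict, List

INT_MAX = 2147483647  # "maximum" used by every integer property in the proposal

# maxLength limits copied from the proposed top-level string properties
STRING_LIMITS = {"language": 32, "user_agent": 255, "gitlab_realm": 32}
# Top-level integer properties
INTEGER_FIELDS = ("prefix_length", "suffix_length")
# Required integer counters inside each request_counts item
REQUIRED_COUNTERS = ("requests", "errors", "accepts")
# Optional, nullable string properties inside each request_counts item
ITEM_STRING_LIMITS = {"lang": 32, "model_engine": 64, "model_name": 64}


def _is_bounded_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so it is excluded explicitly
    return (
        isinstance(value, int)
        and not isinstance(value, bool)
        and 0 <= value <= INT_MAX
    )


def validate_context(data: Dict[str, Any]) -> List[str]:
    """Return a list of violations; an empty list means the payload fits."""
    problems: List[str] = []
    for field in INTEGER_FIELDS:
        if field in data and not _is_bounded_int(data[field]):
            problems.append(f"{field}: expected integer in [0, {INT_MAX}]")
    for field, limit in STRING_LIMITS.items():
        if field in data:
            value = data[field]
            if not isinstance(value, str) or len(value) > limit:
                problems.append(f"{field}: expected string of at most {limit} chars")
    for index, item in enumerate(data.get("request_counts", [])):
        for counter in REQUIRED_COUNTERS:
            if not _is_bounded_int(item.get(counter)):
                problems.append(f"request_counts[{index}].{counter}: missing or out of range")
        for field, limit in ITEM_STRING_LIMITS.items():
            value = item.get(field)
            if value is not None and (not isinstance(value, str) or len(value) > limit):
                problems.append(f"request_counts[{index}].{field}: expected null or short string")
    return problems


# Illustrative payload; every value here is invented for the example.
example_context = {
    "prefix_length": 1024,
    "suffix_length": 0,
    "language": "python",
    "user_agent": "example-extension/1.0",
    "gitlab_realm": "saas",
    "request_counts": [
        {"requests": 3, "errors": 0, "accepts": 1, "lang": "python",
         "model_engine": None, "model_name": None},
    ],
}
```

Calling `validate_context(example_context)` returns an empty list for this payload. In practice the schema in the iglu repository stays the source of truth; a check like this would only be useful as a quick local sanity test.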
- Track a Structured Event with the category code_suggestions and the action suggestion_requested
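The steps above could be wired together roughly as follows. This is a sketch under assumptions, not the implementation from !233: it assumes the `snowplow-tracker` package exposes `AsyncEmitter`, `Tracker`, `SelfDescribingJson` and `Tracker.track_struct_event` with the keyword arguments used here, and the schema URI in `CONTEXT_SCHEMA` is a hypothetical placeholder (the real path is defined by the iglu merge request). The names `SnowplowConfig`, `build_tracker`, `build_context` and `track_suggestion_requested` are invented for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

CATEGORY = "code_suggestions"
ACTION = "suggestion_requested"
# Hypothetical schema URI, used only for illustration.
CONTEXT_SCHEMA = "iglu:com.gitlab/code_suggestions_context/jsonschema/1-0-0"


@dataclass(frozen=True)
class SnowplowConfig:
    """Production values proposed in this issue (namespace and app_id are open for discussion)."""
    endpoint: str = "snowplow.trx.gitlab.net"
    namespace: str = "snowplow_tracker"
    app_id: str = "gitlab_ai_gateway"


def build_tracker(config: SnowplowConfig):
    # Imported lazily so this module can be loaded without snowplow-tracker installed.
    from snowplow_tracker import AsyncEmitter, Tracker

    # AsyncEmitter sends events from a background thread instead of blocking the request.
    emitter = AsyncEmitter(config.endpoint, protocol="https")
    return Tracker(namespace=config.namespace, emitters=emitter, app_id=config.app_id)


def build_context(data: Dict[str, Any]):
    # Wraps the properties in a self-describing JSON that points at the schema.
    from snowplow_tracker import SelfDescribingJson

    return SelfDescribingJson(CONTEXT_SCHEMA, data)


def track_suggestion_requested(tracker, contexts: Optional[List[Any]] = None) -> None:
    """Send one structured event for a code suggestion request."""
    tracker.track_struct_event(category=CATEGORY, action=ACTION, context=contexts)
```

A request handler would then call something like `track_suggestion_requested(tracker, [build_context(properties)])`, where `tracker` is created once at startup with `build_tracker(SnowplowConfig())`. Keeping the configuration in a small dataclass makes it easy to point non-production environments at a different collector.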