Update Cube schemas to exclude anonymous users
The analytics data we receive is often not user-specific because of privacy protections on the user's browser or device. As a result, the data we show on the dashboards can be misleading, as many users are grouped together.
To allow users to exclude these anonymous users, we are adding a toggle in the UI and a new property in the ClickHouse data.
Therefore, we also need to update the Cube schemas to use segments or filters to only return the appropriate set of users. Segments would be useful when it comes to setting up pre-aggregations for anon/non-anon users.
Any changes made to the schemas should be copied to the analytics stack upon merging within the DevKit.
Proposed technical solution
Because a user can only ever be identified or anonymous, and because this filter is meant to apply to all queries whenever the UI toggle is switched, it makes the most sense to use an anonymousUsers segment. This also lets us combine it with data blending, which means that in future we could support comparing identified and anonymous users within the same visualization, or showing both types of users within the same dashboard.
The proposed approach also means we could add additional segments if comparing different types of users becomes more of a product feature.
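To make the segment-versus-filter trade-off concrete, here is a minimal sketch of the two query shapes. The member names are assumptions: TrackedEvents.anonymousUsers is the segment proposed above, and TrackedEvents.userIdType is a hypothetical dimension over the user_id_type column that the filter variant would need.

```javascript
// Hypothetical member names: neither is confirmed by the current schema.
const ANONYMOUS_SEGMENT = 'TrackedEvents.anonymousUsers';
const USER_ID_TYPE_DIMENSION = 'TrackedEvents.userIdType'; // assumed dimension over user_id_type

const baseQuery = {
  measures: ['TrackedEvents.pageViewsCount'],
  timeDimensions: [
    { dimension: 'TrackedEvents.derivedTstamp', granularity: 'day' },
  ],
};

// Segment variant: the condition lives in the schema and the query only names it.
const segmentQuery = { ...baseQuery, segments: [ANONYMOUS_SEGMENT] };

// Filter variant: every caller has to spell out the condition itself.
const filterQuery = {
  ...baseQuery,
  filters: [
    { member: USER_ID_TYPE_DIMENSION, operator: 'equals', values: ['anonymous'] },
  ],
};

console.log(JSON.stringify(segmentQuery, null, 2));
console.log(JSON.stringify(filterQuery, null, 2));
```

With the segment variant, a change to how anonymous users are detected only touches the schema, which is the main reason to prefer it here.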
An example of a possible blended query to compare page views between all users and anonymous users:
[
{
"measures": [
"TrackedEvents.pageViewsCount"
],
"timeDimensions": [
{
"dimension": "TrackedEvents.derivedTstamp",
"granularity": "day"
}
],
"segments": [
"TrackedEvents.anonymousUsers"
]
},
{
"measures": [
"TrackedEvents.pageViewsCount"
],
"timeDimensions": [
{
"dimension": "TrackedEvents.derivedTstamp",
"granularity": "day"
}
]
}
]
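On the UI side, the toggle could be wired in by adding a segment to each outgoing query. The following is only a sketch under assumptions: the helper name applyAnonymousToggle is hypothetical, and it assumes the "exclude anonymous users" toggle maps to a TrackedEvents.knownUsers segment.

```javascript
// Assumed default: the segment that the "exclude anonymous users" toggle maps to.
const KNOWN_USERS_SEGMENT = 'TrackedEvents.knownUsers';

/**
 * Hypothetical helper: returns a copy of `query` with the segment added when
 * the toggle is on. Accepts a single query or an array (a blended query).
 */
function applyAnonymousToggle(query, excludeAnonymous, segment = KNOWN_USERS_SEGMENT) {
  if (Array.isArray(query)) {
    return query.map((q) => applyAnonymousToggle(q, excludeAnonymous, segment));
  }
  if (!excludeAnonymous) {
    return { ...query };
  }
  const segments = query.segments ?? [];
  return {
    ...query,
    // Avoid listing the same segment twice if the query already has it.
    segments: segments.includes(segment) ? [...segments] : [...segments, segment],
  };
}

const pageViews = {
  measures: ['TrackedEvents.pageViewsCount'],
  timeDimensions: [{ dimension: 'TrackedEvents.derivedTstamp', granularity: 'day' }],
};

console.log(applyAnonymousToggle(pageViews, true).segments);
```

Segments in a single query are combined with AND, so applying this helper to a query that already uses anonymousUsers would return no rows for that series. A real implementation would likely need to skip such queries.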
Implementation plan
- Update usages of domain_userid to user_id now that we have a single field for all user IDs.
- Create a new segment to limit by the anonymous value in the user_id_type field:
  segments: { knownUsers: { sql: `${CUBE}.user_id_type IN ('cookie', 'identify')` } },
- Add the new segment to the TrackedEvents schema.
- Deploy to the DevKit.
- Deploy to the analytics stack.
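Put together, the schema change might look roughly like the fragment below. This is a sketch rather than the actual TrackedEvents schema: the table name, the simplified measure and dimension, and the pre-aggregation are assumptions for illustration, and it assumes 'anonymous' is the user_id_type value for unidentified users. It is a Cube schema fragment, so cube and CUBE are supplied by Cube when the file is loaded.

```javascript
cube('TrackedEvents', {
  sql: `SELECT * FROM tracked_events`, // hypothetical table name

  segments: {
    // Assumes 'anonymous' marks unidentified users in user_id_type.
    anonymousUsers: {
      sql: `${CUBE}.user_id_type = 'anonymous'`,
    },
    knownUsers: {
      sql: `${CUBE}.user_id_type IN ('cookie', 'identify')`,
    },
  },

  measures: {
    pageViewsCount: { type: 'count' }, // simplified stand-in for the existing measure
  },

  dimensions: {
    derivedTstamp: { sql: `derived_tstamp`, type: 'time' }, // assumed column name
  },

  preAggregations: {
    // Hypothetical daily rollup restricted to known users via the segment.
    knownUsersPageViews: {
      measures: [CUBE.pageViewsCount],
      timeDimension: CUBE.derivedTstamp,
      granularity: 'day',
      segments: [CUBE.knownUsers],
    },
  },
});
```

Defining both segments keeps the toggle (knownUsers) and the blended comparison (anonymousUsers) on the same underlying column, so the two can never disagree about who counts as anonymous.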