Telemetry: create dashboards
What we have now
We have collected topology data for some single-node instances through usage ping. The data is saved in the usage_data table of the versions database. It then goes through the ETL process from the version DB to Snowflake.
This dashboard shows raw data of object_store and topology. It is used to validate the collected data. There is also a version controller to query by id (or UUID): https://version.gitlab.com/usage_data/<id, or UUID>
Create dashboards from these data
Though we will continue to validate and enhance the data, there are ~4000 records today that can likely show us something interesting.
We want to create some useful dashboards from this data in order to:
- validate that the data makes sense from different breakdown views, which gives us confidence in the NSM metric calculated from it
  - whether there is obviously wrong data
  - whether there is missing data
- discover topology statistics that will help us make informed decisions
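As a rough illustration of the "obviously wrong" and "missing" checks, here is a minimal Python sketch that could be run over exported records before charting them. The payload shape and field names (`topology`, `nodes`, `node_memory_total_bytes`, `node_services`) are assumptions made for the example and may not match the actual usage ping payload; `validate_record` is a hypothetical helper.

```python
from typing import Any, Dict, List


def validate_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one usage ping record.

    Field names are assumed for illustration, not taken from the real schema.
    """
    topology = record.get("topology")
    if not topology:
        # Missing data: the record carries no topology section at all.
        return ["missing topology"]

    problems: List[str] = []
    nodes = topology.get("nodes") or []
    if not nodes:
        problems.append("no nodes reported")

    for i, node in enumerate(nodes):
        memory = node.get("node_memory_total_bytes")
        if memory is None:
            problems.append(f"node {i}: missing memory")
        elif memory <= 0:
            # Obviously wrong data: a node cannot have zero or negative memory.
            problems.append(f"node {i}: non-positive memory")
        if not node.get("node_services"):
            problems.append(f"node {i}: no services")
    return problems
```

Counting how many of the ~4000 records return a non-empty list, grouped by problem string, would give a first view of data quality.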
We can start with something like the following (thanks @mkaeppler):
- histogram of users with node count (that will show outliers and should break out nicely when we launch multi-node)
- by-service breakdown of process count
- a fixed app server type chart
- maybe something like percentiles of memory used per service
- some aggregation/clustering of uname stats so that we get an idea of how many linuxes we run and which versions etc
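To make the first few charts concrete, the sketch below computes the underlying aggregates in Python: instances per node count, process count per service, and memory percentiles per service. It is only a sketch under assumptions: the record shape and field names (`name`, `process_count`, `process_memory_rss`) are hypothetical, and in Sisense the same aggregations would more likely be written as queries against the warehouse.

```python
from collections import Counter, defaultdict
from statistics import quantiles


def node_count_histogram(records):
    """Count how many instances report each number of nodes."""
    return Counter(len(r["topology"]["nodes"]) for r in records)


def process_count_by_service(records):
    """Sum the reported process counts per service name across all records."""
    totals = Counter()
    for r in records:
        for node in r["topology"]["nodes"]:
            for svc in node["node_services"]:
                totals[svc["name"]] += svc.get("process_count", 0)
    return totals


def memory_percentiles_by_service(records, field="process_memory_rss"):
    """Return p50 and p95 of the given memory field for each service."""
    samples = defaultdict(list)
    for r in records:
        for node in r["topology"]["nodes"]:
            for svc in node["node_services"]:
                if field in svc:
                    samples[svc["name"]].append(svc[field])

    result = {}
    for name, values in samples.items():
        if len(values) < 2:
            continue  # too few samples to compute cut points
        cuts = quantiles(values, n=100)  # 99 cut points: cuts[k-1] is the k-th percentile
        result[name] = {"p50": cuts[49], "p95": cuts[94]}
    return result
```

With only single-node data the node count histogram collapses to a single bucket, which is expected until multi-node launches.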
Then we would add the NSM chart.
We will need editor access to create/edit Sisense dashboards. Otherwise we have to ask the Data team to create each of them, which might be tedious for them, since the dashboards will evolve for a while before they reach their final version.