Update Prometheus to 2.0
Updated deliverables
- New installs should default to Prometheus 2.0
- Make it an optional, opt-in upgrade for existing customers for now
- Provide manual instructions for end-users on how to upgrade while preserving Prometheus data.
- Plan for a required upgrade in 12.0. (With deprecation notice as noted here: #2940 (comment 75777992))
Original issue description
With the release of Prometheus 2.0 today, and its attendant performance and resource utilization improvements, we should consider the upgrade process.
There are two major breaking changes:
- Alerts configuration changes, which should not impact us since we don't use those today
- Change to the time series database, which causes data loss of all stored metric data
Because the impact is data loss of the 2 weeks of stored metrics, we are looking at a few options:
1. Schedule the breaking change for 11.0 and warn people about the loss of their historical data.
2. In addition to option 1, also support upgrading to 2.0 prior to the 11.0 release. This would allow fresh installs to be on 2.0 going forward, and users to upgrade earlier if they want. It does require packaging both versions, along with a new flag in gitlab.rb.
3. Investigate the creation of a migration tool to move data from the existing TSDB to the 2.0 TSDB. An initial feasibility investigation is in progress right now by @juliusv.
4. Stand up Prometheus 2.0 alongside 1.x, configure 1.x as a remote read source for a period of time, and then eventually stop and remove it. (The current retention period is 2 weeks.)
We should have a ballpark estimate for the migration tool (the third option) soon, and based on that we can make further decisions.
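
For the side-by-side option, a rough sketch of what the Prometheus 2.0 configuration could look like while the 1.x server is still running. This is only an illustration: the port (9094) and the assumption that both servers run on the same host are placeholders, not decisions made in this issue. The `remote_read` section and the `/api/v1/read` endpoint are standard Prometheus features.

```yaml
# prometheus.yml for the new 2.0 server (sketch, not the packaged config)
remote_read:
  # Assumed address of the old 1.x server, moved to a non-default port.
  # Queries for data older than what 2.0 has stored are served from here.
  - url: "http://localhost:9094/api/v1/read"
```

Once the retention period has passed, the 1.x server no longer holds anything 2.0 lacks, so the `remote_read` entry and the old server could both be removed.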
Edited by Balasankar 'Balu' C