Async index creation

Review changes
Download
Patches
Plain diff

Andreas Brandl requested to merge ab/async-index-creation into master Jul 20, 2021

Overview 99
Commits 5
Pipelines 13
Changes 17

What does this MR do?

This introduces "async index creation". It aims to offload actual index creation to a more convenient time, e.g. during low-traffic times. We piggy-back on the reindexing cronjob that runs on the weekends and prioritize async index creation over any reindexing activity (this is a hack, potentially should move to a separate cronjob).

Async index creation works in two phases:

Prepare - this is a regular migration to prepare the index, but it does not create it. Instead, it creates a record in postgres_async_indexes with the actual index definition
Create index - this is a regular migration using add_concurrent_index to create the index. If it already exists (due to preceding async index creation), this is a no-op. If it doesn't, it creates the index normally as one would expect. In both cases, we remove the record from postgres_async_indexes.

In that sense, async index creation is "best effort" and optional. If it's not happening async, it'll happen with (2). This doesn't change the time for when you can start relying on the presence of the index - it's still only after add_concurrent_index has been run.

Scope of usage

This is a developer convenience tool aimed for use on GitLab.com only. It is only relevant in the GitLab.com environment, which sees highly frequent deploys and where we can't afford blocking the deploy for many hours to create some indexes.

The workflow is for developers only and requires a manual checkpoint: Has the index been created asynchronously already? If yes, we can merge the second migration that has the add_concurrent_index steps (which are then a no-op). If that goes wrong, i.e. if we merge too early, we would see the index creation block the deploy (as expected).

Background

In context of primary key int4 -> int8 migrations, we need to create a lot of indexes on large tables in %14.2. This can be a manual effort, e.g. doing through the change management process to create them - however, that causes quite a bit of overhead to coordinate.

This change introduces tooling to automate it: We can simply ship migrations to "prepare" indexes, wait for them to be created over the weekend and come back and continue with dependent changes the week after.

It is not a strict alternative to using the change management process, but rather an additional means to getting those indexes created - so whatever turns out to be quicker, we can use that. I expect we'll have more situations like this in the future, so I'd deem this useful in any case.

Status:

Works on my machine 😸

Does this MR meet the acceptance criteria?

Conformity

I have included changelog trailers, or none are needed. (Does this MR need a changelog?)
I have added/updated documentation, or it's not needed. (Is documentation required?)
I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?)
I have added information for database reviewers in the MR description, or it's not needed. (Does this MR have database related changes?)
I have self-reviewed this MR per code review guidelines.
This MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines)
I have followed the style guides.
This change is backwards compatible across updates, or this does not apply.

Availability and Testing

I have added/updated tests following the Testing Guide, or it's not needed. (Consider all test levels. See the Test Planning Process.)
I have tested this MR in all supported browsers, or it's not needed.
I have informed the Infrastructure department of a default or new setting change per definition of done, or it's not needed.

Edited Aug 02, 2021 by Patrick Bair

Merge request reports

Assignee Loading

Reviewers Loading

Request review from

Loading

Time tracking Loading

Loading