Update migration helpers to use check constraints instead of change_column_null
What does this MR do?
This MR introduces new migration helpers for managing NOT NULL
constraints and updates existing migration helpers
to use check constraints instead of change_column_null
.
Problem Description
The main issue to be addressed by this MR is described in #38358 (closed):
We want to avoid access exclusive locks in SET NOT NULL migrations.
We had a migration that used add_column_with_default
and attempted to run:
ALTER TABLE 'ci_build_needs' ALTER COLUMN 'artifacts' SET NOT NULL
That locked the table for about 8-10 minutes before it completed, probably because it still had to do a full table scan of 9 million rows.
The proposal is to change how a NOT NULL
constraint on a column is added.
Currently, we simply run change_column_null
which results to the following stament:
ALTER TABLE 'ci_build_needs' ALTER COLUMN 'artifacts' SET NOT NULL
This has two problems:
- It requires an exclusive lock on the table, possibly interfering with traffic
- The validation can take an extended period of time (while holding the lock)
Instead we want to switch that to:
- Add a
NOT VALID
NOT NULL
check constraint for the column - Validate the constraint afterwards
This separates the change of the table (which requires the exclusive lock) from the validation of the constraint (which only needs a SHARE UPDATE EXCLUSIVE
type of lock). The benefit is that we don't hold the exclusive lock for an extended period of time (but we still need it, to add the constraint in the first place).
Implementation details
To achieve the aforementioned solution, this MR:
-
Adds migration helpers for managing
NOT NULL
constraints (add, validate, remove, check if exists)We only need the
add_not_null_constraint
to solve the problem described in the previous section, but we also add all the helpers that will allow us to manageNOT NULL
constraints in various use cases.The most important additional option provided with those helpers is the ability to add a NOT NULL constraint in a migration with
validate: false
, run a data migration to clean up the column and then validate the constraint, similarly to our approach for foreign keys and text limit constraints. -
Update the
disable_statement_timeout
migration helper to allow calling it multiple times, even from inside a block of anotherdisable_statement_timeout
, without resetting the timeout before the outer block finishes.This is a simple but important update introduced in this MR: it allows using migration helpers that disable the statement timeout inside other migration helpers that also disable the statement timeout.
I am going to showcase this in the following section.
-
Update migration helpers to use
add_not_null_constraint
to enforceNOT NULL
for existing columns instead ofchange_column_null
-
Update the Database Guides with a reference to use
add_not_null_constraint
instead ofchange_column_null
Additional updates to the Database Guides will be introduced with #216361 after this MR is merged.
The updates affect both add_column_with_default
and rename_column_concurrently
(as it uses create_column_for
that also adds not null constraints).
Update on disable_statement_timeout
With our current implementation of disable_statement_timeout
, if we were to call a helper that disables the statement timeout (e.g. add_check_constraint
) inside another helper that disables the statement timeout (e.g. add_column_with_default
), then when the first finishes, it resets the statement_timeout
and all commands that come after it (even though we are still in a disable_statement_timeout
block) run without a disabled statement timeout.
Assume the following test migration:
class DelayedValidationOfNotNull < ActiveRecord::Migration[6.0]
include Gitlab::Database::MigrationHelpers
DOWNTIME = false
disable_ddl_transaction!
def up
log_statement_timeout('OUTSIDE of disable_statement_timeout')
disable_statement_timeout do
log_statement_timeout('In disable_statement_timeout')
disable_statement_timeout do
log_statement_timeout('In disable_statement_timeout > disable_statement_timeout')
end
log_statement_timeout('In disable_statement_timeout')
end
log_statement_timeout('OUTSIDE of disable_statement_timeout')
raise 'stop'
end
def down
# no-op
end
private
def log_statement_timeout(msg)
puts "#{msg} | statement_timeout = #{connection.select_value('SHOW statement_timeout')}"
end
end
Our current implementation would result to the following:
OUTSIDE of disable_statement_timeout | statement_timeout = 15s
-- execute("SET statement_timeout TO 0")
In disable_statement_timeout | statement_timeout = 0
-- execute("SET statement_timeout TO 0")
In disable_statement_timeout > disable_statement_timeout | statement_timeout = 0
-- execute("RESET ALL")
In disable_statement_timeout | statement_timeout = 15s
-- execute("RESET ALL")
OUTSIDE of disable_statement_timeout | statement_timeout = 15s
So RESET ALL
is running twice and everything after the first one (end of the inner disable_statement_timeout
block) and the second one (end of the top level disable_statement_timeout
block) runs with the default statement timeout.
After this update:
OUTSIDE of disable_statement_timeout | statement_timeout = 15s
-- execute("SET statement_timeout TO 0")
In disable_statement_timeout | statement_timeout = 0
In disable_statement_timeout > disable_statement_timeout | statement_timeout = 0
In disable_statement_timeout | statement_timeout = 0
-- execute("RESET ALL")
OUTSIDE of disable_statement_timeout | statement_timeout = 15s
That means that we can reuse and call the various migration helpers (especially the ones that add check constraints) in our other migration helpers without worrying that we'll break their behavior.
Test run showcasing that this update properly addresses the problems described
Assume the following test migration:
class DelayedValidationOfNotNull < ActiveRecord::Migration[6.0]
include Gitlab::Database::MigrationHelpers
DOWNTIME = false
disable_ddl_transaction!
def up
add_column_with_default(
:issues,
:new_not_null_column,
:boolean,
default: false,
allow_null: false
)
end
def down
remove_column :issues, :new_not_null_column
end
end
Checking that this update works:
$ bundle exec rake db:migrate
== 20200501164742 DelayedValidationOfNotNull: migrating =======================
-- transaction_open?()
-> 0.0000s
-- execute("SET statement_timeout TO 0")
-> 0.0004s
-- transaction()
-- add_column(:issues, :new_not_null_column, :boolean, {:default=>nil})
-> 0.0021s
-- change_column_default(:issues, :new_not_null_column, false)
-> 0.0048s
-> 0.0081s
-- columns(:issues)
-> 0.0023s
-- transaction_open?()
-> 0.0000s
-- exec_query("SELECT COUNT(*) AS count FROM \"issues\"")
-> 0.0017s
-- exec_query("SELECT \"issues\".\"id\" FROM \"issues\" ORDER BY \"issues\".\"id\" ASC LIMIT 1")
-> 0.0005s
-- exec_query("SELECT \"issues\".\"id\" FROM \"issues\" WHERE \"issues\".\"id\" >= 1 ORDER BY \"issues\".\"id\" ASC LIMIT 1 OFFSET 23")
-> 0.0004s
-- execute("UPDATE \"issues\" SET \"new_not_null_column\" = FALSE WHERE \"issues\".\"id\" >= 1 AND \"issues\".\"id\" < 24")
-> 0.0030s
[ ... ... ...]
-- execute("UPDATE \"issues\" SET \"new_not_null_column\" = FALSE WHERE \"issues\".\"id\" >= 438")
-> 0.0006s
-- transaction_open?()
-> 0.0000s
-- execute("ALTER TABLE issues\nADD CONSTRAINT check_5463530bb7\nCHECK ( new_not_null_column IS NOT NULL )\nNOT VALID;\n")
-> 0.0024s
-- execute("ALTER TABLE issues VALIDATE CONSTRAINT check_5463530bb7;")
-> 0.0008s
-- execute("RESET ALL")
-> 0.0002s
== 20200501164742 DelayedValidationOfNotNull: migrated (0.2029s) ==============
$ git diff db/structure.sql
@@ -3384,7 +3384,9 @@ CREATE TABLE public.issues (
external_key character varying(255),
- sprint_id bigint
+ sprint_id bigint,
+ new_not_null_column boolean DEFAULT false,
+ CONSTRAINT check_5463530bb7 CHECK ((new_not_null_column IS NOT NULL))
);
$ gdk psql
gitlabhq_development=# \d+ issues;
...
Check constraints:
"check_5463530bb7" CHECK (new_not_null_column IS NOT NULL)
...
gitlabhq_development=# \q
$ bundle exec rake db:rollback
== 20200501164742 DelayedValidationOfNotNull: reverting =======================
-- remove_column(:issues, :new_not_null_column)
-> 0.0021s
== 20200501164742 DelayedValidationOfNotNull: reverted (0.0022s) ==============
Does this MR meet the acceptance criteria?
Conformity
- [-] Changelog entry
-
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. - [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team