Error upgrading to GitLab 11.11
Summary
When upgrading PostgreSQL in GitLab 11.11, the upgrade crashed.
Steps to reproduce
- Install GitLab CE Omnibus 11.10
- Upgrade to 11.11
- Run sudo gitlab-ctl pg-upgrade
What is the current bug behavior?
The upgrade process fails with an error, and the GitLab service remains stuck on the "Deploy in progress" page.
What is the expected correct behavior?
The upgrade should complete successfully.
Relevant logs and/or screenshots
Checking if PostgreSQL bin files are symlinked to the expected location: OK
Waiting 30 seconds to ensure tasks complete before PostgreSQL upgrade.
Please hit Ctrl-C now if you want to cancel the operation.
Toggling deploy page:cp /opt/gitlab/embedded/service/gitlab-rails/public/deploy.html /opt/gitlab/embedded/service/gitlab-rails/public/index.html
Toggling deploy page: OK
Toggling services:ok: down: alertmanager: 1s, normally up
ok: down: crond: 0s, normally up
ok: down: gitaly: 0s, normally up
ok: down: gitlab-monitor: 0s, normally up
ok: down: logrotate: 1s, normally up
ok: down: node-exporter: 0s, normally up
ok: down: postgres-exporter: 1s, normally up
ok: down: prometheus: 0s, normally up
ok: down: redis-exporter: 1s, normally up
ok: down: registry: 0s, normally up
ok: down: sidekiq: 0s, normally up
Toggling services: OK
Stopping the database:ok: down: postgresql: 1s, normally up
Stopping the database: OK
Symlink correct version of binaries: OK
Creating temporary data directory:Traceback (most recent call last):
11: from /opt/gitlab/embedded/bin/omnibus-ctl:23:in `<main>'
10: from /opt/gitlab/embedded/bin/omnibus-ctl:23:in `load'
9: from /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/omnibus-ctl-0.6.0/bin/omnibus-ctl:31:in `<top (required)>'
8: from /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/omnibus-ctl-0.6.0/lib/omnibus-ctl.rb:746:in `run'
7: from /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/omnibus-ctl-0.6.0/lib/omnibus-ctl.rb:204:in `block in add_command_under_category'
6: from /opt/gitlab/embedded/service/omnibus-ctl/pg-upgrade.rb:140:in `block in load_file'
5: from /opt/gitlab/embedded/service/omnibus-ctl/pg-upgrade.rb:194:in `general_upgrade'
4: from /opt/gitlab/embedded/service/omnibus-ctl/pg-upgrade.rb:152:in `common_pre_upgrade'
3: from /opt/gitlab/embedded/service/omnibus-ctl/pg-upgrade.rb:226:in `create_temp_data_dir'
2: from /opt/gitlab/embedded/service/omnibus-ctl/pg-upgrade.rb:383:in `progress_message'
1: from /opt/gitlab/embedded/service/omnibus-ctl/pg-upgrade.rb:228:in `block in create_temp_data_dir'
/opt/gitlab/embedded/service/omnibus-ctl/lib/gitlab_ctl/pg_upgrade.rb:39:in `run_pg_command': undefined method `[]' for nil:NilClass (NoMethodError)
The instance has disk space available:
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p1   50G  8.2G   42G  17% /
devtmpfs        3.8G     0  3.8G   0% /dev
tmpfs           3.8G   24K  3.8G   1% /dev/shm
tmpfs           3.8G  390M  3.4G  11% /run
tmpfs           3.8G     0  3.8G   0% /sys/fs/cgroup
tmpfs           772M     0  772M   0% /run/user/1000
tmpfs           772M     0  772M   0% /run/user/0
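The free-space check above could also be scripted. The sketch below assumes that the upgrade copies the existing data directory (the pg_upgrade log further down shows a "Copying user relation files" step), so the filesystem should have at least as much free space as the data directory currently uses. The function names avail_kb, used_kb and has_room_for_copy are hypothetical helpers, not part of GitLab or PostgreSQL.

```shell
#!/bin/sh
# Hypothetical helpers for a rough pre-upgrade disk space check.

# avail_kb PATH: free space, in KiB, on the filesystem that holds PATH.
# -P forces the POSIX one-line-per-filesystem format, -k reports KiB.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# used_kb PATH: disk usage, in KiB, of the directory tree rooted at PATH.
used_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# has_room_for_copy DATA_DIR: succeeds when the filesystem has at least as
# much free space as DATA_DIR uses (assumption: the upgrade makes a full copy).
has_room_for_copy() {
    [ "$(avail_kb "$1")" -ge "$(used_kb "$1")" ]
}
```

For example, has_room_for_copy /var/opt/gitlab/postgresql/data && echo "enough space", run as a user that can read the data directory.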
Process to resolve issue
I tried the following manual procedure:
[centos@ip-10-2-0-69 ~]$ sudo su - gitlab-psql
-sh-4.2$ pwd
/var/opt/gitlab/postgresql
-sh-4.2$ mkdir data.10
-sh-4.2$ /opt/gitlab/embedded/postgresql/10/bin/initdb -D data.10/
The files belonging to this database system will be owned by user "gitlab-psql".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory data.10 ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
/opt/gitlab/embedded/postgresql/10/bin/pg_ctl -D data.10/ -l logfile start
-sh-4.2$ /opt/gitlab/embedded/postgresql/10/bin/pg_upgrade -d data -D data.10 -b /opt/gitlab/embedded/postgresql/9.6/bin -B /opt/gitlab/embedded/postgresql/10/bin/
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
Checking database user is the install user ok
Checking database connection settings ok
Checking for prepared transactions ok
Checking for reg* data types in user tables ok
Checking for contrib/isn with bigint-passing mismatch ok
Checking for invalid "unknown" user columns ok
Creating dump of global objects ok
Creating dump of database schemas
ok
Checking for presence of required libraries ok
Checking database user is the install user ok
Checking for prepared transactions ok
If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.
Performing Upgrade
------------------
Analyzing all rows in the new cluster ok
Freezing all rows in the new cluster ok
Deleting files from new pg_xact ok
Copying old pg_clog to new server ok
Setting next transaction ID and epoch for new cluster ok
Deleting files from new pg_multixact/offsets ok
Copying old pg_multixact/offsets to new server ok
Deleting files from new pg_multixact/members ok
Copying old pg_multixact/members to new server ok
Setting next multixact ID and offset for new cluster ok
Resetting WAL archives ok
Setting frozenxid and minmxid counters in new cluster ok
Restoring global objects in the new cluster ok
Restoring database schemas in the new cluster
ok
Copying user relation files
ok
Setting next OID for new cluster ok
Sync data directory to disk ok
Creating script to analyze new cluster ok
Creating script to delete old cluster ok
Checking for hash indexes ok
Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
./analyze_new_cluster.sh
Running this script will delete the old cluster's data files:
./delete_old_cluster.sh
-sh-4.2$ mv data data.9
-sh-4.2$ mv data.10 data
-sh-4.2$ ls -la
total 24
drwxr-xr-x. 4 gitlab-psql root 107 May 23 09:19 .
drwxr-xr-x. 22 root root 4096 May 23 09:19 ..
-rwx------. 1 gitlab-psql gitlab-psql 795 May 23 09:18 analyze_new_cluster.sh
drwx------. 19 gitlab-psql gitlab-psql 4096 May 23 09:18 data
drwx------. 19 gitlab-psql root 4096 May 23 09:18 data.9
-rwx------. 1 gitlab-psql gitlab-psql 25 May 23 09:18 delete_old_cluster.sh
-rw-------. 1 gitlab-psql root 52 Mar 4 18:33 .profile
-sh-4.2$ exit
[centos@ip-10-2-0-69 ~]$ sudo gitlab-ctl reconfigure
...
NO errors
...
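The steps run as gitlab-psql in the transcript above can be condensed into a short script. This is only a sketch of what was typed by hand, not what gitlab-ctl pg-upgrade does internally. The paths and the data.9 / data.10 names come from the transcript; the DRY_RUN switch and the run and manual_pg_upgrade helpers are hypothetical additions. It assumes the script runs as gitlab-psql and that the database is already stopped, as it was here after the failed pg-upgrade run.

```shell
#!/bin/sh
# Condensed sketch of the manual upgrade typed in the transcript.
PG_HOME=/var/opt/gitlab/postgresql
OLD_BIN=/opt/gitlab/embedded/postgresql/9.6/bin
NEW_BIN=/opt/gitlab/embedded/postgresql/10/bin
DRY_RUN=${DRY_RUN:-1}   # hypothetical switch: by default only print commands

# run CMD...: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

manual_pg_upgrade() {
    run cd "$PG_HOME" || return 1
    # Create and initialise an empty PostgreSQL 10 cluster next to the old one.
    run mkdir data.10 || return 1
    run "$NEW_BIN/initdb" -D data.10 || return 1
    # Migrate the 9.6 cluster in "data" into the new cluster in "data.10".
    run "$NEW_BIN/pg_upgrade" -d data -D data.10 -b "$OLD_BIN" -B "$NEW_BIN" || return 1
    # Keep the old cluster as data.9 and put the new one in its place.
    run mv data data.9 || return 1
    run mv data.10 data
}
```

Calling manual_pg_upgrade with the default DRY_RUN=1 only prints the six commands; setting DRY_RUN=0 would execute them. Each step stops the function on failure, so the directories are not swapped unless pg_upgrade succeeded.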
I then accessed the site, and the "Deploy" page was still there. I removed it with:
sudo rm /opt/gitlab/embedded/service/gitlab-rails/public/index.html
Everything "seems" to work now. Are there any checks I should run?
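One check that could be scripted, as a sketch: every PostgreSQL data directory contains a PG_VERSION file recording the major version that created it, so after the swap the data directory should report 10 and data.9 should report 9.6. The helper names data_dir_version and check_data_dir_version are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: confirm which PostgreSQL version a data directory
# belongs to by reading its PG_VERSION file.

# data_dir_version DIR: print the version recorded in DIR/PG_VERSION.
data_dir_version() {
    cat "$1/PG_VERSION"
}

# check_data_dir_version DIR EXPECTED: succeed only if DIR reports EXPECTED.
check_data_dir_version() {
    actual=$(data_dir_version "$1") || return 1
    if [ "$actual" = "$2" ]; then
        echo "OK: $1 is a PostgreSQL $2 data directory"
    else
        echo "MISMATCH: $1 reports $actual, expected $2"
        return 1
    fi
}
```

Run as gitlab-psql or root, since the data directories are only readable by their owner, for example check_data_dir_version /var/opt/gitlab/postgresql/data 10. Beyond that, sudo gitlab-ctl status and sudo gitlab-rake gitlab:check are the usual Omnibus health checks, and the pg_upgrade output above itself suggests running ./analyze_new_cluster.sh once the new server is up.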
Edited by Nuno Fernandes