service_up? and service_down? methods of OmnibusHelper are acting incorrectly
The OmnibusHelper.service_up? and OmnibusHelper.service_down? methods have gone through changes since their implementation (list follows, in reverse chronological order).
However, we saw recently that they may not be doing what we want:
root@ibaum-dev-01:~# /opt/gitlab/init/postgresql status
run: postgresql: (pid 17301) 9s; run: log: (pid 24342) 517s
root@ibaum-dev-01:~# echo $?
0
root@ibaum-dev-01:~# /opt/gitlab/init/postgresql stop
ok: down: postgresql: 1s, normally up
root@ibaum-dev-01:~# echo $?
0
root@ibaum-dev-01:~# /opt/gitlab/init/postgresql status
down: postgresql: 5s, normally up; run: log: (pid 24342) 528s
root@ibaum-dev-01:~# echo $?
0
root@ibaum-dev-01:~# /opt/gitlab/init/postgresql start
ok: run: postgresql: (pid 17387) 0s
root@ibaum-dev-01:~# echo $?
0
root@ibaum-dev-01:~#
The exit code is 0 even when the service is down. It is 0 even when we check a non-existent service:
balasankar@dev:~$ sudo gitlab-ctl status asdf
balasankar@dev:~$ echo $?
0
This is because gitlab-ctl status uses two things for execution:
- The run_command method, which uses $? that doesn't actually return the exit code of the system command.
- /opt/gitlab/init/<service> status, which will return exit code 0 if it ran successfully. The sv manpage says the following:
sv exits 0, if the command was successfully sent to all services, and, if it was told to wait, the command has taken effect to all services.
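To illustrate the first point in general Ruby terms (a minimal sketch, not the actual run_command code; the helper name exit_code_of is hypothetical): system returns true, false, or nil, and $? holds a Process::Status object, so the numeric exit code has to be read explicitly through exitstatus.

```ruby
require 'rbconfig'

# Hypothetical helper, not part of omnibus-ctl: runs a command and returns
# its numeric exit code, or nil if the command could not be started.
def exit_code_of(*command)
  started = system(*command)
  return nil if started.nil? # the command could not be executed at all

  $?.exitstatus # Process::Status -> Integer exit code
end

# The current Ruby interpreter is used as a portable child process.
puts exit_code_of(RbConfig.ruby, '-e', 'exit 3') # prints 3
puts exit_code_of(RbConfig.ruby, '-e', 'exit 0') # prints 0
```

Even with the exit code read correctly, the second point still applies: the status command itself exits 0 whenever it ran successfully, regardless of whether the service is up.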
My understanding is that we use service_up? to check whether the service is actually running. The current implementation does not do that.
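One possible direction, sketched under assumptions: since the exit code is 0 either way, a check could inspect the status text instead, which begins with run: for a running service and down: for a stopped one in the output above. The method name status_line_up? is hypothetical, this is not the existing OmnibusHelper code, and it assumes the status line format shown in the transcript.

```ruby
# Hypothetical helper (not the existing OmnibusHelper code): decide whether a
# service is up from the text printed by `/opt/gitlab/init/<service> status`.
# Assumes the line format seen above: "run: <name>: ..." or "down: <name>: ...".
def status_line_up?(status_line)
  # Only the part before the first ';' describes the service itself;
  # the remainder describes its log process.
  service_part = status_line.to_s.split(';').first.to_s.strip
  service_part.start_with?('run:')
end

puts status_line_up?('run: postgresql: (pid 17301) 9s; run: log: (pid 24342) 517s')   # true
puts status_line_up?('down: postgresql: 5s, normally up; run: log: (pid 24342) 528s') # false
puts status_line_up?('')                                                              # false
```

The empty-string case corresponds to the non-existent service above, where the status command printed nothing and still exited 0.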