Run service discovery on load balancing configuration
What does this MR do?
This MR forces an initial run of postgres service discovery as soon as load balancing is configured. This solves the problem in #323726 (closed).
Before this MR, Puma would preload the application and set up load balancing with an empty host list before forking. Any database access, whether pre-fork or post-fork, that happened before the associated on_worker_start callback ran service discovery would then query the database primary.
https://gitlab.com/gitlab-org/gitlab/blob/master/config/initializers/load_balancing.rb#L17-27 shows this initialization process.
This MR simply runs service discovery once when load balancing is configured so that we retrieve an initial list of hosts immediately.
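The difference in ordering can be illustrated with a minimal Ruby sketch. Everything in it is hypothetical: `SketchLoadBalancer`, `first_read_host`, and the fake discovery lambda are illustrative stand-ins, not GitLab's actual classes, and the real code resolves hosts through DNS rather than a hard-coded list.

```ruby
# Hypothetical stand-in for a load balancer; not GitLab's actual class.
class SketchLoadBalancer
  def initialize
    @hosts = [] # configured with an empty host list, as described above
  end

  # Replace the host list with whatever discovery returns.
  def refresh_hosts(discovery)
    @hosts = discovery.call
  end

  # Reads go to a replica when one is known, otherwise to the primary.
  def read_host
    @hosts.empty? ? :primary : @hosts.first
  end
end

# Returns the host used by the first read after configuration.
def first_read_host(discover_on_configure:)
  discovery = -> { ['127.0.0.1'] } # fake lookup; a real one would query DNS
  balancer = SketchLoadBalancer.new
  balancer.refresh_hosts(discovery) if discover_on_configure
  balancer.read_host
end
```

In this sketch, `first_read_host(discover_on_configure: false)` returns `:primary`, which mirrors the behaviour before this MR, while `first_read_host(discover_on_configure: true)` returns `"127.0.0.1"`.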
Screenshots or Screencasts (strongly suggested)
How to set up and validate locally (strongly suggested)
There are a few steps required to run service discovery locally. This is the easiest way I've found.
- Install dnsmasq to run a local DNS nameserver.
  - brew install dnsmasq (or your package manager of choice)
- Start dnsmasq on port 53 (the default).
  - sudo brew services start dnsmasq (needs sudo because of the privileged port)
- Verify that dnsmasq is working.
  - dig @localhost localhost +short should return 127.0.0.1
- Stop your local GDK PostgreSQL:
  gdk stop postgresql
- Run PostgreSQL directly so that it opens a TCP port:
  cd your-gdk-dir/postgresql/data && pg_ctl start -D .
- Add the following production entry to your database.yml (this problem is only reproducible with RAILS_ENV=production):
  production:
    main:
      adapter: postgresql
      encoding: unicode
      database: gitlabhq_development
      host: localhost
      port: 5432
      pool: 10
      prepared_statements: false
      variables:
        statement_timeout: 120s
      load_balancing:
        discover:
          nameserver: localhost
          port: 53
          record: localhost
          record_type: A
          interval: 60
          disconnect_timeout: 120
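Because YAML nesting is easy to get wrong when copying this entry, a quick sanity check may help. The following Ruby sketch parses a copy of the entry and confirms that the discover block sits under production, main, load_balancing. The names `SAMPLE_DATABASE_YML`, `EXPECTED_DISCOVER_KEYS`, and `discover_settings` are hypothetical helpers for illustration, and the list of expected keys is an assumption taken from the entry itself, not from GitLab's validation rules.

```ruby
require 'yaml'

# Trimmed copy of the entry above, kept here so the sketch is self-contained.
SAMPLE_DATABASE_YML = <<~YAML
  production:
    main:
      adapter: postgresql
      host: localhost
      port: 5432
      load_balancing:
        discover:
          nameserver: localhost
          port: 53
          record: localhost
          record_type: A
          interval: 60
          disconnect_timeout: 120
YAML

# Assumed set of keys to check for; taken from the entry, not from GitLab.
EXPECTED_DISCOVER_KEYS = %w[nameserver port record record_type].freeze

# Returns the discover settings hash, or raises if the block is misplaced
# or one of the expected keys is missing.
def discover_settings(yaml_text, env: 'production', db: 'main')
  config = YAML.safe_load(yaml_text) || {}
  discover = config.dig(env, db, 'load_balancing', 'discover') || {}
  missing = EXPECTED_DISCOVER_KEYS - discover.keys
  raise ArgumentError, "missing discover keys: #{missing.join(', ')}" unless missing.empty?

  discover
end
```

To check a real file, something like `discover_settings(File.read('config/database.yml'))` should work, assuming the file contains no ERB tags that need rendering first.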
To reproduce the problem without this branch checked out:
- rm log/database_load_balancing.log so that the log starts empty and any new messages clearly come from this run.
- Stop the running Rails server:
  gdk stop rails-web
- Start Puma directly:
  env RAILS_ENV=production bundle exec puma
- Wait until you see the messages that workers have booted, then shut down the server with ^C
- cat log/database_load_balancing.log and see the messages with event: "no_secondaries_available"
To verify that the problem was fixed, check out this branch, and repeat these steps. The log/database_load_balancing.log
file will not be recreated because no messages will be written to it.
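The before/after comparison can also be scripted rather than read by eye. The sketch below counts log entries for a given event and treats a missing log file as zero matches. It assumes each log line is a single JSON object with an "event" key; that format is an assumption and is not confirmed by this description. `count_events` and `count_events_in_file` are hypothetical helper names.

```ruby
require 'json'

# Counts lines whose "event" field equals the given event name.
# Lines that are not JSON objects are ignored.
def count_events(log_text, event)
  log_text.each_line.count do |line|
    begin
      parsed = JSON.parse(line)
      parsed.is_a?(Hash) && parsed['event'] == event
    rescue JSON::ParserError
      false
    end
  end
end

# Same count for a file on disk; a missing file counts as zero matches,
# which is the expected state once the fix is applied.
def count_events_in_file(path, event)
  return 0 unless File.exist?(path)

  count_events(File.read(path), event)
end
```

Running `count_events_in_file('log/database_load_balancing.log', 'no_secondaries_available')` from the Rails root should then give a non-zero count without this branch and 0 with it.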
Does this MR meet the acceptance criteria?
Conformity
- I have included changelog trailers, or none are needed. (Does this MR need a changelog?)
- I have added/updated documentation, or it's not needed. (Is documentation required?)
- I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?)
- I have added information for database reviewers in the MR description, or it's not needed. (Does this MR have database related changes?)
- I have self-reviewed this MR per code review guidelines.
- This MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines)
- I have followed the style guides.
- This change is backwards compatible across updates, or this does not apply.
Availability and Testing
- I have added/updated tests following the Testing Guide, or it's not needed. (Consider all test levels. See the Test Planning Process.)
- I have tested this MR in all supported browsers, or it's not needed.
- I have informed the Infrastructure department of a default or new setting change per definition of done, or it's not needed.