Force ApplicationRecord#with_fast_statement_timeout to use replicas
What does this MR do?
Closes #322133 (closed)
Although we already have a load balancer layer to redirect read-only queries to the replicas, the effectiveness of this load balancing layer is not as high as we expected. On Monday 15 March 2021, about 8.4% of read-only GET
requests queried the primary database. The number slightly increases comparing to 7.4% in the previous week.
After debugging, a major of the requests to primary comes from the following endpoints:
Endpoint | Sum of db_primary_count | Sum of db_primary_duration_s |
---|---|---|
Projects::MergeRequestsController#index | 12,090,304 | 24,792.892 |
Projects::IssuesController#index | 7,438,682 | 20,488.362 |
In Projects::MergeRequestsController#index
, there is one call to Gitlab::IssueablesCountForState#perform_count
: https://gitlab.com/gitlab-org/gitlab/blob/e2dcd5e26a01d7d23685d6cc03592133dc1610f6/lib/gitlab/issuables_count_for_state.rb#L81. This method triggers this line of code:
ApplicationRecord.with_fast_statement_timeout do
finder.count_by_state
end
Surprisingly, all inside statements are wrapped inside a transaction
(0.2ms) BEGIN
SQL (0.2ms) SET LOCAL statement_timeout = 5000
User Load (2.6ms) SELECT "users".* FROM "users" ORDER BY "users"."id" ASC LIMIT 1
(0.2ms) COMMIT
As a result, all the following queries of the request are redirected to the primary. This MR is to fix that issue
Solution
Right now we are treating transaction
as a sticky statement that force all the following statements to the primary DB. However, it is false negative in the scenario that all statements inside a transaction are all read-only. We already have user_primary
to force primary redirections. Adding use_replica_if_possible
seems reasonable.
def with_fast_read_read_statement_timeout
Gitlab::LoadBalancing.use_replica_if_possible do
transaction(requires_new: true) do
connection.exec_query("SET LOCAL statement_timeout = 5000")
yield
end
end
end
After this MR, we may be interested in forcing some other queries to replica as well.
Screenshots (strongly suggested)
Before
After
Does this MR meet the acceptance criteria?
Conformity
-
📋 Does this MR need a changelog?-
I have included a changelog entry. -
I have not included a changelog entry because _____.
-
-
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team