Use ThreadPoolExecutor in ShardedStorage to parallelize bulk operations to different shards
Before raising this MR, consider whether the following are required, and complete if so:
- Unit tests - Covered by existing tests
- Metrics - N/A
- Documentation update(s) - Added option description to parser
Description
This PR aims to speed up `ShardedStorage` by using a `ThreadPoolExecutor` to run bulk operations against the individual shards in parallel. Previously, each shard was accessed sequentially, which was quite slow and threw away a big benefit of sharding. One `ThreadPoolExecutor` is created for the entire storage, and its size can be set in the configuration. I opted for this instead of spinning up a smaller `ThreadPoolExecutor` per request because the scaling is more predictable and tunable.
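For reviewers who want the shape of the change without reading the diff, here is a minimal sketch of the fan-out pattern described above. It is not the code in this PR: `InMemoryShard`, `ShardedStorageSketch`, `thread_pool_size`, and the CRC-based shard selection are hypothetical stand-ins chosen for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


class InMemoryShard:
    """Hypothetical stand-in for a real storage backend."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def bulk_update(self, blobs: Dict[str, bytes]) -> None:
        self._blobs.update(blobs)

    def bulk_read(self, keys: List[str]) -> Dict[str, bytes]:
        return {key: self._blobs[key] for key in keys if key in self._blobs}


class ShardedStorageSketch:
    """Illustrative only: one executor shared by every request."""

    def __init__(self, shards: List[InMemoryShard], thread_pool_size: int = 4) -> None:
        self._shards = shards
        # A single pool for the whole storage, sized from configuration.
        self._executor = ThreadPoolExecutor(max_workers=thread_pool_size)

    def _shard_index(self, key: str) -> int:
        # Assumed shard selection; any deterministic mapping would do here.
        return zlib.crc32(key.encode()) % len(self._shards)

    def bulk_update(self, blobs: Dict[str, bytes]) -> None:
        # Group the blobs by target shard, then submit one task per shard.
        per_shard: Dict[int, Dict[str, bytes]] = {}
        for key, data in blobs.items():
            per_shard.setdefault(self._shard_index(key), {})[key] = data
        futures = [
            self._executor.submit(self._shards[index].bulk_update, part)
            for index, part in per_shard.items()
        ]
        for future in futures:
            future.result()  # re-raises any exception from the worker thread

    def bulk_read(self, keys: List[str]) -> Dict[str, bytes]:
        per_shard: Dict[int, List[str]] = {}
        for key in keys:
            per_shard.setdefault(self._shard_index(key), []).append(key)
        futures = [
            self._executor.submit(self._shards[index].bulk_read, part)
            for index, part in per_shard.items()
        ]
        merged: Dict[str, bytes] = {}
        for future in futures:
            merged.update(future.result())
        return merged

    def close(self) -> None:
        self._executor.shutdown(wait=True)
```

With a shared pool, `thread_pool_size` caps the total number of concurrent shard calls across all requests, which is the predictability argument made above; a per-request pool would instead scale thread count with request concurrency.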
We already have a lot of threads, but storage operations should be mostly I/O-bound, so my hope is that we will still benefit from the additional ones.
As part of implementing this, I added a small helper to `context.py` that copies all the BuildGrid ContextVars into the sharded storage worker threads, which keeps the logging working as expected.
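To illustrate why the context has to be carried over explicitly: values set on a `ContextVar` in the request thread are not automatically visible to tasks submitted to a `ThreadPoolExecutor`. The sketch below shows one standard-library way to propagate them, using `contextvars.copy_context()`. It is an assumption about the general technique, not the helper added to `context.py`; `request_id`, `submit_with_context`, and `demo` are hypothetical names.

```python
import contextvars
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable

# Hypothetical variable standing in for the BuildGrid ContextVars.
request_id: contextvars.ContextVar = contextvars.ContextVar(
    "request_id", default="unset"
)


def submit_with_context(
    executor: ThreadPoolExecutor, fn: Callable[..., Any], *args: Any
) -> Future:
    # Snapshot the caller's context at submission time. A fresh copy is taken
    # per task because one Context object cannot be entered by two threads at
    # the same time.
    ctx = contextvars.copy_context()
    return executor.submit(ctx.run, fn, *args)


def read_request_id() -> str:
    # Runs in a worker thread; sees whatever context it was started with.
    return request_id.get()


def demo() -> str:
    with ThreadPoolExecutor(max_workers=2) as executor:
        request_id.set("req-123")
        return submit_with_context(executor, read_request_id).result()
```

Changes a task makes to a variable inside the copied context stay in that copy and do not leak back into the request thread, which is usually what you want for logging metadata.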