Project import of `gitlab_project` should use Workhorse acceleration: part I. Import via API.
Problem
The current implementation of the file transfer from the client to the server does not use Workhorse upload acceleration, which performs a direct upload to object storage.
Export archives are usually large, often in the range of 100 MB to 5 GB, so we frequently hit Unicorn timeouts. Unicorn is configured to run a request for at most 60 s, which is often not enough to transfer the archive from disk to remote storage.
Currently, the process is as follows (a rough sketch of this flow follows the list):
- The request is accepted by Workhorse,
- Workhorse intercepts the request and streams the file to shared on-disk uploads storage,
- Workhorse rewrites the request body to include the path to the file on the shared on-disk storage,
- Rails creates a new project,
- The default storage for all uploads is configured to be object storage,
- Since the file was written to disk, Rails transfers it to object storage,
- Since the file is of substantial size, this transfer can take up to a few minutes.
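The sketch below illustrates this flow in Go. It is a simplification for discussion only: the shared directory, the `file.path`/`file.name` field names and the proxying details are assumptions, not the actual Workhorse implementation.

```go
// Simplified sketch of the current (non-accelerated) flow: buffer the archive
// onto shared disk, then hand Rails only the on-disk path.
package main

import (
	"bytes"
	"io"
	"log"
	"mime/multipart"
	"net/http"
	"net/http/httputil"
	"net/url"
	"os"
	"path/filepath"
)

const sharedUploadsDir = "/var/opt/gitlab/gitlab-rails/shared/uploads/tmp" // assumed location

func main() {
	rails, _ := url.Parse("http://localhost:8080") // assumed Rails backend
	proxy := httputil.NewSingleHostReverseProxy(rails)

	http.HandleFunc("/api/v4/projects/import", func(w http.ResponseWriter, r *http.Request) {
		// 1. Buffer the whole archive onto shared on-disk storage.
		tmp, err := os.CreateTemp(sharedUploadsDir, "import-*.tar.gz")
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		defer tmp.Close()
		if _, err := io.Copy(tmp, r.Body); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}

		// 2. Rewrite the body: Rails receives the on-disk path, not the bytes.
		var buf bytes.Buffer
		mw := multipart.NewWriter(&buf)
		mw.WriteField("file.path", tmp.Name()) // hypothetical field name
		mw.WriteField("file.name", filepath.Base(tmp.Name()))
		mw.Close()

		r.Body = io.NopCloser(&buf)
		r.ContentLength = int64(buf.Len())
		r.Header.Set("Content-Type", mw.FormDataContentType())

		// 3. Rails then has to copy the file from disk to object storage itself,
		//    which is where the slow, timeout-prone work happens today.
		proxy.ServeHTTP(w, r)
	})

	log.Fatal(http.ListenAndServe(":8181", nil))
}
```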
Proposal
Workhorse already has a well-developed direct-upload mechanism that allows us to stream the file directly to object storage without storing it on disk.
The import feature should use direct upload, similarly to how it is already used for artifacts uploads and packages uploads.
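To make the mechanism concrete, here is a minimal Go sketch of what the direct-upload path could look like for imports, modeled on the artifacts flow. The `/authorize` endpoint, the JSON field names and the URLs are assumptions for illustration, not the exact Workhorse/Rails protocol.

```go
// Minimal sketch of direct upload for imports: pre-authorize with Rails,
// stream the body straight to object storage, forward only metadata to Rails.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// authorization is what Rails could answer to a pre-authorization request:
// a pre-signed PUT URL plus the temporary object key it expects back.
type authorization struct {
	StoreURL  string `json:"StoreURL"`  // assumed field name
	RemoteTmp string `json:"RemoteTmp"` // assumed field name
}

func preauthorize(railsBase string) (*authorization, error) {
	resp, err := http.Post(railsBase+"/api/v4/projects/import/authorize", "application/json", nil)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	auth := &authorization{}
	return auth, json.NewDecoder(resp.Body).Decode(auth)
}

func main() {
	http.HandleFunc("/api/v4/projects/import", func(w http.ResponseWriter, r *http.Request) {
		auth, err := preauthorize("http://localhost:8080") // assumed Rails backend
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}

		// Stream the client's body straight to object storage; nothing is
		// written to local disk and Unicorn never sees the archive bytes.
		put, err := http.NewRequest(http.MethodPut, auth.StoreURL, r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		put.ContentLength = r.ContentLength
		resp, err := http.DefaultClient.Do(put)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		if resp.StatusCode >= 300 {
			http.Error(w, "upload to object storage failed: "+resp.Status, http.StatusBadGateway)
			return
		}

		// In the real flow Workhorse would now rewrite the request with this
		// metadata and proxy it to Rails; here we only report the key.
		fmt.Fprintf(w, "stored at %s\n", auth.RemoteTmp)
	})

	log.Fatal(http.ListenAndServe(":8181", nil))
}
```

After the streaming PUT completes, the Rails request carries only the temporary object key and metadata, so the 60 s Unicorn budget is no longer spent on moving archive bytes.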
Rails would not have to transfer the file from disk to object storage; the only operation Rails would perform, and it is a cheap one, is moving the file from a temporary location in object storage to a persistent location in object storage. We already do that extensively for artifacts, and it works great: artifacts of a few GB are handled easily when needed.
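For reference, that "move" boils down to a server-side copy plus a delete within object storage; no archive bytes flow through the application. In GitLab this happens in Ruby (CarrierWave/fog); the Go/S3 version below is only a sketch of the operation, and the bucket and key names are made up.

```go
// Sketch of promoting a temporary object to its persistent location with a
// server-side copy and a delete; the object never leaves object storage.
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func main() {
	sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-east-1")}))
	svc := s3.New(sess)

	bucket := "gitlab-uploads"                    // hypothetical bucket
	tmpKey := "tmp/uploads/import-1234.tar.gz"    // temporary object written by Workhorse
	finalKey := "project/import/42/export.tar.gz" // persistent location

	// Server-side copy (objects above ~5 GB would need a multipart copy instead).
	_, err := svc.CopyObject(&s3.CopyObjectInput{
		Bucket:     aws.String(bucket),
		CopySource: aws.String(bucket + "/" + tmpKey),
		Key:        aws.String(finalKey),
	})
	if err != nil {
		log.Fatal(err)
	}

	// Drop the temporary object once the copy has succeeded.
	if _, err := svc.DeleteObject(&s3.DeleteObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(tmpKey),
	}); err != nil {
		log.Fatal(err)
	}
}
```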