Draft: Use Go Cloud for S3 transfers
The current mechanism of using pre-signed URLs is limited to 5 GB per upload and is slower than the parallel, multipart upload mechanism implemented in the AWS SDK.
When IAM instance profile credentials are used, the temporary credentials are retrieved by the AWS SDK inside the cache-archiver.
When static credentials are used, these credentials (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) are passed along to GoCloud via environment variables. To avoid leaking these variables to the user:
- For the shell executor, these environment variables are injected into the environment only for the cache archiver/retrieval stages.
- For the Docker executor, these environment variables are passed into the environment during container creation for the cache archiver/retrieval stages.
- For the Kubernetes executor, a separate helper container, cache-helper, is created with these credentials defined in its environment.
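The per-stage injection described in the list above can be sketched roughly as follows. This is a minimal illustration, not the runner's actual code: the stage names, `envForStage`, and `hasKey` are hypothetical and only show the idea of exposing the static credentials to cache stages alone.

```go
package main

import (
	"fmt"
	"strings"
)

// cacheStages lists the stages allowed to see the static credentials.
// These stage names are hypothetical, chosen for illustration only.
var cacheStages = map[string]bool{
	"restore_cache": true,
	"archive_cache": true,
}

// envForStage returns the environment for a stage as KEY=VALUE pairs.
// The credentials are appended only for cache stages; base is not modified.
func envForStage(stage string, base []string, creds map[string]string) []string {
	env := append([]string(nil), base...) // copy so the caller's slice stays untouched
	if !cacheStages[stage] {
		return env
	}
	for _, key := range []string{"AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"} {
		if value, ok := creds[key]; ok {
			env = append(env, key+"="+value)
		}
	}
	return env
}

// hasKey reports whether env contains an entry for key.
func hasKey(env []string, key string) bool {
	for _, kv := range env {
		if strings.HasPrefix(kv, key+"=") {
			return true
		}
	}
	return false
}

func main() {
	base := []string{"CI=true"}
	creds := map[string]string{
		"AWS_ACCESS_KEY_ID":     "example-id",
		"AWS_SECRET_ACCESS_KEY": "example-secret",
	}
	for _, stage := range []string{"build_script", "archive_cache"} {
		env := envForStage(stage, base, creds)
		fmt.Printf("%s sees credentials: %v\n", stage, hasKey(env, "AWS_ACCESS_KEY_ID"))
	}
}
```

Copying the base environment rather than mutating it keeps the credentials from lingering in a slice that is later reused for user-facing stages.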
Note that users may also define variables such as AWS_ACCESS_KEY_ID as CI/CD variables in their projects. To ensure cache uploads won't be affected by these user-defined variables, we need to avoid exporting them in the shell script. For the cache-related stages, we do this by filtering out job variables that conflict with the variables defined by the static credentials.
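The filtering step can be illustrated with a short sketch. `JobVariable` and `withoutConflicts` are hypothetical names used here for illustration; the actual types and helpers in the runner may differ.

```go
package main

import "fmt"

// JobVariable is a simplified, hypothetical stand-in for a CI/CD job variable.
type JobVariable struct {
	Key   string
	Value string
}

// withoutConflicts returns the job variables whose keys are not defined by
// the static credentials, preserving the original order.
func withoutConflicts(jobVars []JobVariable, credentialEnv map[string]string) []JobVariable {
	kept := make([]JobVariable, 0, len(jobVars))
	for _, v := range jobVars {
		if _, conflict := credentialEnv[v.Key]; conflict {
			continue // the user-defined value would override the cache credentials
		}
		kept = append(kept, v)
	}
	return kept
}

func main() {
	jobVars := []JobVariable{
		{Key: "AWS_ACCESS_KEY_ID", Value: "user-defined-id"},
		{Key: "DEPLOY_ENV", Value: "staging"},
	}
	credentialEnv := map[string]string{
		"AWS_ACCESS_KEY_ID":     "cache-id",
		"AWS_SECRET_ACCESS_KEY": "cache-secret",
	}
	for _, v := range withoutConflicts(jobVars, credentialEnv) {
		fmt.Println(v.Key)
	}
}
```

Matching on the keys of the credential environment, rather than on a hard-coded list, means only variables that are actually supplied by the static credentials get filtered out.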
TODO
- Write tests
- Add feature flag for exporting env variables
- Check behavior of previous ShouldUseIAMCredentials()
- Verify that this works with shell executor but does not leak secrets
- Verify that this works with K8S executor but does not leak secrets
Closes #26921