
Draft: Use Go Cloud for S3 transfers

Stan Hu requested to merge sh-aws-gocloud-support into main

The current mechanism of using pre-signed URLs is limited to uploads of 5 GB and is slower than the parallel, multipart upload mechanism implemented in the AWS SDK.

When IAM instance profile credentials are used, the temporary credentials are retrieved by the AWS SDK inside the cache-archiver.

When static credentials are used, these credentials (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) are passed along to Go Cloud via environment variables. To avoid leaking these variables to the user:

  1. For the shell executor, these environment variables are injected into the environment only for the cache archiver/retrieval stages.
  2. For the Docker executor, these environment variables are passed to the environment during container creation for the cache archiver/retrieval stages.
  3. For the Kubernetes executor, a separate cache-helper container is created with these credentials defined in the environment.
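
The stage-scoped injection described above could look roughly like the following. This is only a sketch, not the code in this merge request: the stage names, `cacheStages`, and `envForStage` are hypothetical, and the real executors build their environments differently.

```go
package main

import (
	"fmt"
	"sort"
)

// cacheStages lists hypothetical names for the cache archiver/retrieval
// stages. The runner's real stage names may differ.
var cacheStages = map[string]bool{
	"restore_cache": true,
	"archive_cache": true,
}

// envForStage returns a copy of base, extended with the static credentials
// only when stage is one of the cache stages. Keys are appended in sorted
// order so the result is deterministic.
func envForStage(stage string, base []string, creds map[string]string) []string {
	env := append([]string(nil), base...)
	if !cacheStages[stage] {
		// User-controlled stages never see the credentials.
		return env
	}
	keys := make([]string, 0, len(creds))
	for k := range creds {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		env = append(env, k+"="+creds[k])
	}
	return env
}

func main() {
	base := []string{"CI=true"}
	creds := map[string]string{
		"AWS_ACCESS_KEY_ID":     "example-key",
		"AWS_SECRET_ACCESS_KEY": "example-secret",
	}
	// Print only the variable counts so no secret values are echoed.
	fmt.Println(len(envForStage("build_script", base, creds)))
	fmt.Println(len(envForStage("archive_cache", base, creds)))
}
```

Copying `base` before appending keeps the caller's slice untouched, so a later user stage cannot accidentally inherit the credentials from a shared backing array.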

Note that users may also define variables such as AWS_ACCESS_KEY_ID as CI/CD variables in their projects. To ensure cache uploads are not affected by these user-defined variables, we need to avoid exporting them in the shell script. For the cache-related stages, we do this by filtering out job variables that conflict with the variables defined by the static credentials.
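
As an illustration of the filtering step, here is a minimal sketch. It assumes job variables can be modelled as simple key/value pairs; the `Variable` type and `filterConflicts` are hypothetical names and not the runner's actual types or functions.

```go
package main

import "fmt"

// Variable is a hypothetical stand-in for a job variable.
type Variable struct {
	Key   string
	Value string
}

// filterConflicts returns the job variables whose keys do not collide with
// the keys defined by the static credentials.
func filterConflicts(jobVars []Variable, reserved map[string]string) []Variable {
	kept := make([]Variable, 0, len(jobVars))
	for _, v := range jobVars {
		if _, conflict := reserved[v.Key]; conflict {
			// Skip the user-defined value so the runner's credentials win.
			continue
		}
		kept = append(kept, v)
	}
	return kept
}

func main() {
	reserved := map[string]string{
		"AWS_ACCESS_KEY_ID":     "runner-key",
		"AWS_SECRET_ACCESS_KEY": "runner-secret",
	}
	jobVars := []Variable{
		{Key: "AWS_ACCESS_KEY_ID", Value: "user-key"},
		{Key: "CI_JOB_ID", Value: "42"},
	}
	for _, v := range filterConflicts(jobVars, reserved) {
		fmt.Printf("%s=%s\n", v.Key, v.Value) // prints "CI_JOB_ID=42"
	}
}
```

In this sketch the filter would apply only when generating the script for the cache-related stages, so user-defined values remain available to the job's own script.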

TODO

  • Write tests
  • Add feature flag for exporting env variables
  • Check behavior of previous ShouldUseIAMCredentials()
  • Verify that this works with shell executor but does not leak secrets
  • Verify that this works with K8S executor but does not leak secrets

Closes #26921 (closed)

Edited by Stan Hu
