Add URL scrubbing to trace secret masking
What does this MR do?
Adds URL scrubbing to the trace mask transformers, replacing the previous regex-based implementation.
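As a rough illustration of scrubbing URL parameters without a regex, here is a minimal sketch. It is not the code in this MR: `scrubURLParams`, `sensitiveParams`, and the helper functions are hypothetical names, the parameter list only contains the two parameters shown as masked in the example output further down, and the rule for where a value ends is an assumption.

```go
package main

import (
	"bytes"
	"strings"
)

// sensitiveParams is a hypothetical list of query parameter names whose
// values should be hidden. The real list used by the runner may differ.
var sensitiveParams = []string{"X-Amz-Credential", "X-Amz-Signature"}

const mask = "[MASKED]"

// isSensitive reports whether name matches a sensitive parameter, ignoring case.
func isSensitive(name []byte) bool {
	for _, p := range sensitiveParams {
		if strings.EqualFold(string(name), p) {
			return true
		}
	}
	return false
}

// isNameByte reports whether c may appear in a parameter name (assumption).
func isNameByte(c byte) bool {
	return c == '-' || c == '_' ||
		(c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// isValueEnd reports whether c terminates a parameter value (assumption).
func isValueEnd(c byte) bool {
	return c == '&' || c == '"' || c == ' ' || c == '\t' || c == '\n' || c == '\r'
}

// scrubURLParams copies p to a new slice, replacing the value of every
// sensitive query parameter with [MASKED]. It makes a single pass over the
// input: after each '?' or '&' it reads a name up to '=', and if the name is
// sensitive it skips the value and writes the mask instead.
func scrubURLParams(p []byte) []byte {
	var out bytes.Buffer
	i := 0
	for i < len(p) {
		c := p[i]
		out.WriteByte(c)
		i++
		if c != '?' && c != '&' {
			continue
		}

		// Read the candidate parameter name.
		start := i
		for i < len(p) && isNameByte(p[i]) {
			i++
		}
		if i >= len(p) || p[i] != '=' || !isSensitive(p[start:i]) {
			// Not a sensitive parameter: emit the name unchanged.
			out.Write(p[start:i])
			continue
		}

		// Emit "name=", skip the value, and write the mask in its place.
		out.Write(p[start : i+1])
		i++
		for i < len(p) && !isValueEnd(p[i]) {
			i++
		}
		out.WriteString(mask)
	}
	return out.Bytes()
}
```

Because the function works on plain bytes, the same call can be applied to a log message and to each log field value, which is the case the line-based approach missed. This sketch works on a complete buffer; handling a parameter split across two writes would need extra state that is not shown here.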
Why was this MR needed?
- It's much faster
- The previous implementation only scrubbed individual log lines, not log fields, so secrets in fields were accidentally revealed (#4625 (closed), !1541 (closed))
Old benchmarks
BenchmarkBuffer10k-16 261 47012861 ns/op 74.02 MB/s 9400 B/op 24 allocs/op
BenchmarkBuffer10kWithURLScrub-16 55 210216486 ns/op 16.55 MB/s 17768035 B/op 100055 allocs/op
New benchmarks
BenchmarkBuffer10k-16 196 56731874 ns/op 61.34 MB/s 13608 B/op 26 allocs/op
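There's no need for a Buffer10kWithURLScrub benchmark now, as URL scrubbing is covered by the same masking process. BenchmarkBuffer10k becomes slightly slower (about 21% more ns/op), but it is nowhere near as slow as the old implementation was when a URL was included (roughly 3.7x fewer ns/op than the old BenchmarkBuffer10kWithURLScrub).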
What's the best way to test this MR?
Unit tests:
go test ./helpers/trace
Manual test:
A known problem this fixes is the leaking of information that would appear in log fields.
If you incorrectly configure an S3 cache (with nonsense values):
[runners.cache]
Type = "s3"
Path = "path/to/prefix"
Shared = false
[runners.cache.s3]
ServerAddress = "s3.amazonaws.com"
AccessKey = "AWS_S3_ACCESS_KEY"
SecretKey = "AWS_S3_SECRET_KEY"
BucketName = "runners-cache"
BucketLocation = "eu-west-1"
Insecure = false
And run a job that uses the cache, the invalid settings cause an error that unfortunately leaks the secrets.
Before (secrets are exposed):
WARNING: Retrying... error=Get "https://runners-cache.s3.dualstack.eu-west-1.amazonaws.com/path/to/prefix/runner/tx7Rk1qD/project/19958710/default?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AWS_S3_ACCESS_KEY%2F20210423%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Date=20210423T163411Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=4f1f745874d2c4715982ac35ff907864986e62e4929ffe81f76f4b8b6fdb9600": 301 response missing Location header
WARNING: Retrying... error=Get "https://runners-cache.s3.dualstack.eu-west-1.amazonaws.com/path/to/prefix/runner/tx7Rk1qD/project/19958710/default?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AWS_S3_ACCESS_KEY%2F20210423%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Date=20210423T163411Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=4f1f745874d2c4715982ac35ff907864986e62e4929ffe81f76f4b8b6fdb9600": 301 response missing Location header
FATAL: Get "https://runners-cache.s3.dualstack.eu-west-1.amazonaws.com/path/to/prefix/runner/tx7Rk1qD/project/19958710/default?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AWS_S3_ACCESS_KEY%2F20210423%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Date=20210423T163411Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=4f1f745874d2c4715982ac35ff907864986e62e4929ffe81f76f4b8b6fdb9600": 301 response missing Location header
After (values are now masked):
WARNING: Retrying... error=Get "https://runners-cache.s3.dualstack.eu-west-1.amazonaws.com/path/to/prefix/runner/tx7Rk1qD/project/19958710/default?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=[MASKED]&X-Amz-Date=20210423T155744Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=[MASKED] 301 response missing Location header
WARNING: Retrying... error=Get "https://runners-cache.s3.dualstack.eu-west-1.amazonaws.com/path/to/prefix/runner/tx7Rk1qD/project/19958710/default?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=[MASKED]&X-Amz-Date=20210423T155744Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=[MASKED] 301 response missing Location header
FATAL: Get "https://runners-cache.s3.dualstack.eu-west-1.amazonaws.com/path/to/prefix/runner/tx7Rk1qD/project/19958710/default?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=[MASKED]&X-Amz-Date=20210423T155744Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=[MASKED] 301 response missing Location header
What are the relevant issue numbers?
closes #4625 (closed)
Edited by Arran Walker