Stop error level logging on deleting cgroup file
What
Instead of returning an error for files found in the process directory, log their path at info level.
Why
In production, we are seeing the logs below, which are just noise and not really useful:
{"error":"gitaly process directory contains an unexpected file","level":"error","msg":"could not prune entry","path":"/sys/fs/cgroup/cpu/gitaly/cpuacct.usage_percpu_user","time":"2022-09-22T22:03:57.874Z"}
{"error":"gitaly process directory contains an unexpected file","level":"error","msg":"could not prune entry","path":"/sys/fs/cgroup/cpu/gitaly/cpuacct.usage_sys","time":"2022-09-22T22:03:57.874Z"}
{"error":"gitaly process directory contains an unexpected file","level":"error","msg":"could not prune entry","path":"/sys/fs/cgroup/cpu/gitaly/cpuacct.usage_user","time":"2022-09-22T22:03:57.874Z"}
{"error":"gitaly process directory contains an unexpected file","level":"error","msg":"could not prune entry","path":"/sys/fs/cgroup/cpu/gitaly/notify_on_release","time":"2022-09-22T22:03:57.874Z"}
{"error":"gitaly process directory contains an unexpected file","level":"error","msg":"could not prune entry","path":"/sys/fs/cgroup/cpu/gitaly/tasks","time":"2022-09-22T22:03:57.874Z"}
The PruneOldGitalyProcessDirectories function is used in 2 different contexts:
- Delete the cgroup directory under the hierarchy_root, for example /gitaly. Inside the cgroup directory there are always files like cpu.stat which can't be deleted.
- Delete the runtime directory, where if there are files in that directory, we ignore them.
In !4891 (comment 1117227167) we discussed moving the cgroup deletion into a separate method, but that would be a lot of code duplication. Instead, change the log level to info rather than error. The error handling only logs the error message, and we don't have any other code path that we execute.
Reference: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16270
Edited by Steve Xuereb