Architecture simplification: stop running gitlab-pages as root
gitlab-pages can currently be run either as root or as an unprivileged process. As I understand it, the reasons we support running as root are:
- a root pages process can open privileged TCP ports (80, 443)
- a root pages process can spawn an unprivileged child in a chroot as a defense-in-depth security measure
These capabilities come at a significant complexity cost.
- process supervision: the root parent supervises the unprivileged child
- chroot management: we need to make sure certain system resources are accessible in the chroot, which we currently do with bind mounts
- passing configuration from the root parent to the unprivileged child: we serialize and deserialize configuration data internally so that e.g. TLS keys don't exist on disk in the chroot, but do exist in memory for the unprivileged child in the chroot
- passing network sockets as file descriptors from parent to child
This is all non-trivial stuff which slows down development. To make things worse, we also support a single-process mode where the main process is unprivileged and does the work itself. Most development and testing uses the single-process mode; you can only test the two-process mode when you run tests as root.
I think we can make things a lot simpler if we move to a different architecture:
- pages only runs unprivileged, as a single process
- use NGINX TCP reverse proxying to pass traffic on privileged ports into pages. This is only needed for single-machine omnibus and source installs; clusters and cloud native deployments have their own edge load balancers which can proxy TCP to gitlab-pages anyway
- increase test coverage, if needed, to guard against local file inclusion
At the time we developed the first version of gitlab-pages, NGINX could not yet do TCP proxying, but that is included in the standard NGINX on Ubuntu 16.04 now (the --with-stream option listed in nginx -V).
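To make the proposal concrete, here is a rough sketch of what the NGINX side could look like. This is not a tested omnibus configuration: the local ports 8080 and 8443 are placeholders I picked for an unprivileged gitlab-pages listener, and it assumes nothing else in NGINX is already bound to ports 80 and 443 on the same address.

```nginx
# Sketch only. The stream block lives at the top level of nginx.conf,
# next to (not inside) the http block.
stream {
    # Plain HTTP: forward raw TCP from the privileged port to pages.
    server {
        listen 80;
        proxy_pass 127.0.0.1:8080;  # hypothetical unprivileged pages HTTP port
    }

    # HTTPS: NGINX does not terminate TLS here. It forwards the TCP stream
    # as-is, so pages keeps handling certificates itself.
    server {
        listen 443;
        proxy_pass 127.0.0.1:8443;  # hypothetical unprivileged pages HTTPS port
    }
}
```

One thing to keep in mind with plain TCP proxying is that pages would see NGINX as the peer address of every connection, so anything that depends on the client IP (access logs, for example) would need separate handling.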
I think gitlab-pages will be easier to maintain and easier to contribute to if we make it "dumber" like this.