Add a global file size limit for pushes - log mode
Problem
Currently, project administrators can set a limit on how large files pushed to a repository may be. However, administrators have no way to set a hard global limit.
As added context, GitHub enforces a hard 100 MB limit across their site.
In allowing unlimited file sizes, we expose ourselves to risk on two main fronts:
1. Scale & Performance
Git was not designed to handle huge files. This poses a performance and scalability risk:
- clones will become slow
- clones can consume large amounts of network bandwidth (cost)
- repository maintenance slows down considerably with large repositories
In fact, we recently experienced a disk saturation event caused in part by large files being pushed to a repository.
2. Cost
Having large files on the repository ends up being a sink on storage cost.
Proposal
- Add the ability for administrators to set file size limits per tier for git pushes
- Make this controllable via plan limits or a UI field
- For files beyond this size, users will need to use Git LFS, which is already supported in GitLab.
- The limit should only apply to new files
- Updates to a file can be over the limit when the file was already over the limit before the update; the update will not be rejected
- Repositories that contain files that would be over the limit in a free tier should continue to be successfully forkable, mirrorable, and importable into a free tier project
- In forking scenarios, updates to a file can be over the limit when the file was already over the limit in the parent project; the update will not be rejected
- In forking scenarios, updates to a file can be over the limit when the file is under the higher tier limit in the parent project but over the lower tier limit in the fork; the update will not be rejected
- Updates to files in mirrors or imported projects should behave analogously to the two forking scenarios just above.
- These cases must be tested to ensure we are not breaking the forking, mirroring, and import experiences.
- Have a feature flag to switch between log mode and prevent mode, so that we can first log what we would have prevented and investigate whether we missed something, then turn on prevent mode to actually prevent it. If we still find issues, we can easily and quickly return to log mode.
- When prevent mode is turned on, reject imports and pushes of new files above the size limit, with an error message to the user.
- Observability around rejected files, so we keep track of what we are preventing.
- The feature must be documented so administrators know how to use it.
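To make the semantics above concrete, here is a minimal sketch of the decision logic in Ruby. The method and symbol names are illustrative assumptions for this proposal, not actual GitLab internals:

```ruby
# Decides what to do with a pushed file, per the proposal:
#   - the limit applies only to new files
#   - updates to existing files (including files already over the limit
#     in forks, mirrors, and imports) are never rejected
#   - a feature flag toggles between :log and :prevent modes
def file_size_check(size:, limit:, new_file:, mode: :log)
  # Files under the limit, updates to existing files, and pushes with no
  # configured limit always pass.
  return :allow if limit.nil? || size <= limit || !new_file

  # Oversized new file: reject in prevent mode, otherwise only log it.
  mode == :prevent ? :reject : :log_only
end
```

In log mode, `:log_only` results would be recorded for observability so we can verify the limit behaves as intended before flipping the flag to prevent mode.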
Pros
- cost savings in storage
- cost savings in network bandwidth
- more performant repositories
- better repository maintenance
- relatively simple implementation, since we already have this mechanism under the hood. It would involve adding a database column for the global hard limit, and changes to Rails code to enforce the limit based on it on the /allowed internal endpoint.
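As a sketch of what the enforcement on the /allowed path could look like, assuming the limit is read from the new database column: the `Blob` struct, attribute names, and log format below are hypothetical, chosen only to illustrate the shape of the check.

```ruby
# Hypothetical blob summary for a pushed file (not actual GitLab code).
Blob = Struct.new(:path, :size_bytes, :new_file, keyword_init: true)

# Returns error messages to surface to the pushing user in prevent mode;
# in log mode it only emits observability logs and returns an empty list,
# so the push proceeds.
def validate_changed_blobs(blobs, limit_bytes:, prevent_mode: false)
  offenders = blobs.select do |b|
    b.new_file && limit_bytes && b.size_bytes > limit_bytes
  end

  offenders.each do |b|
    # Observability: track every file we reject (or would reject).
    warn "[file-size-limit] over limit: #{b.path} (#{b.size_bytes} bytes)"
  end

  return [] unless prevent_mode

  offenders.map do |b|
    "#{b.path} is larger than the #{limit_bytes}-byte limit; please use Git LFS"
  end
end
```

The same check would run for imports, satisfying the "reject imports and pushes" behavior of prevent mode while keeping log mode side-effect free.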
Cons
The main downside is that users and customers have grown accustomed to unlimited file sizes. This may be inconvenient for them, but it can be argued that these are the very users who would benefit most from such limits, since git performance suffers with large files.
Availability and testing
Feature spec to be added for configuring the upper file size limit as an administrator. Exploratory testing to ensure existing repositories that are over the limit are not impacted.