Reduce Redis usage from merge request diffs caching
In https://gitlab.com/gitlab-com/infrastructure/issues/1631#note_30622014, we saw that 50% of our Redis usage was for merge_request_diffs.
In https://gitlab.com/gitlab-com/infrastructure/issues/3840#note_62833833, @stanhu asked:
Do we need the application to evict keys for merge request diffs (e.g. if a merge request diff is updated, just invalidate the keys that this affects)?
I think we could definitely experiment with this. My concern is that the cached data may still be used for highlighting comments on older versions of the diff, in which case we would be exacerbating https://gitlab.com/gitlab-org/gitlab-ce/issues/43961 just to save on Redis memory usage, which seems backwards to me.
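To make the eviction idea concrete, here is a rough sketch of invalidating only the keys that belong to one merge request diff. It is not how GitLab's cache is actually structured: the `DiffHighlightCache` class, the `highlighted-diff:<diff id>:<path>` key format, and the in-memory dict standing in for Redis are all assumptions made for illustration.

```python
from collections import defaultdict


class DiffHighlightCache:
    """In-memory stand-in for a Redis-backed highlight cache (hypothetical)."""

    def __init__(self):
        self._store = {}                       # cache key -> highlighted HTML
        self._keys_by_diff = defaultdict(set)  # diff id -> keys written for it

    @staticmethod
    def _key(diff_id, file_path):
        # Hypothetical key format; the real key scheme is not shown in this issue.
        return f"highlighted-diff:{diff_id}:{file_path}"

    def write(self, diff_id, file_path, html):
        key = self._key(diff_id, file_path)
        self._store[key] = html
        self._keys_by_diff[diff_id].add(key)

    def read(self, diff_id, file_path):
        return self._store.get(self._key(diff_id, file_path))

    def invalidate_diff(self, diff_id):
        """Drop every cached entry for one diff; return how many were removed."""
        keys = self._keys_by_diff.pop(diff_id, set())
        for key in keys:
            self._store.pop(key, None)
        return len(keys)
```

Tracking keys per diff keeps invalidation proportional to the size of that one diff rather than the whole cache. Note that this only frees memory for diffs we explicitly drop, so anything that still reads highlighted data for an old diff version would hit a cold cache afterwards.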
Do we need to store this diff data somewhere else?
The cached data is the highlighted diff. Diffs are already huge without highlighting (https://gitlab.com/gitlab-org/gitlab-ce/issues/37632#note_60965022), and I think storing the highlighted version in Postgres is a worse idea. Is there anywhere else we could keep it?
How do we store this data more efficiently?
It's an HTML string, so perhaps we could do some clever things there, but otherwise, aren't we best off addressing the number of keys in the cache at any given time instead?
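One of the "clever things" could be compressing the HTML before it goes into the cache. A minimal sketch using Python's standard `zlib` module follows; the `pack`/`unpack` helper names and the sample markup are made up for illustration, and real highlighted diffs will be less repetitive than this sample, so the actual savings would need measuring.

```python
import zlib


def pack(html: str) -> bytes:
    """Compress a highlighted-HTML string before writing it to the cache."""
    return zlib.compress(html.encode("utf-8"), 6)


def unpack(blob: bytes) -> str:
    """Restore the original HTML string from a cached blob."""
    return zlib.decompress(blob).decode("utf-8")


# Illustrative sample only: highlighted markup repeats the same span/class
# boilerplate on every line, which is what makes it compress well.
line = '<span class="line"><span class="k">def</span> <span class="nf">example</span></span>\n'
sample = line * 200

packed = pack(sample)
print(len(sample.encode("utf-8")), "bytes ->", len(packed), "bytes")
```

This trades CPU time on every read and write for memory, and it does nothing about the number of keys, so it would complement rather than replace limiting how many diffs are cached at once.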