Support etag for MR discussions to cache per VU

Nailia Iskhakova (OOO) requested to merge 534-cache-per-vu into main

Adds the ability to cache the etag for the MR discussions endpoint, to emulate web browser behaviour and resolve #534.

With this change, each VU saves the etag value on its first iteration and reuses it for the rest of its calls.
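The per-VU behaviour can be sketched roughly as follows. This is an illustration in Python only, not the code in this MR: `fake_discussions_endpoint`, `VirtualUser` and the etag value are hypothetical stand-ins, and the real endpoint is replaced by a local function so the example runs without a GitLab instance.

```python
from typing import Dict, Optional, Tuple

# Hypothetical etag value; a real server would derive it from the resource.
CURRENT_ETAG = 'W/"abc123"'


def fake_discussions_endpoint(headers: Dict[str, str]) -> Tuple[int, str]:
    """Stand-in for the MR discussions endpoint: returns (status, etag)."""
    if headers.get("If-None-Match") == CURRENT_ETAG:
        return 304, CURRENT_ETAG  # client copy is still current
    return 200, CURRENT_ETAG      # full response with the current etag


class VirtualUser:
    """Each VU keeps its own etag, much like a browser's private cache."""

    def __init__(self) -> None:
        self.etag: Optional[str] = None

    def iterate(self) -> int:
        headers: Dict[str, str] = {}
        if self.etag is not None:
            # Every iteration after the first reuses the saved etag.
            headers["If-None-Match"] = self.etag
        status, etag = fake_discussions_endpoint(headers)
        self.etag = etag
        return status


vu = VirtualUser()
statuses = [vu.iterate() for _ in range(3)]
print(statuses)  # [200, 304, 304]
```

A second `VirtualUser` starts with no etag, so its first iteration gets a full 200 response even though another VU has already cached the value.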

Note: the disabled_mr_discussions_redis_cache FF was enabled on .com and it didn't affect performance, see gitlab-org/gitlab#367098 (comment 1067161277).

How etag works in GitLab

#524 (comment 1108446379)

An example scenario:

  1. User views first page of discussions (discussions.json?per_page=20).
  2. User views second page of discussions (discussions.json?per_page=30).

When the user views the first page, the response includes an etag header. When the user views the first page again (a reload, for example), the etag from the original first page request should be passed as If-None-Match on the repeated first page request. This tells the backend that the client already holds the page with that etag.

The etag from the first page request shouldn't be passed to the second page request as they're different pages.

The main difference is that Redis caching is persisted on the server, while HTTP caching is tied to the client (browser cache). If a GPT test uses the same user and the Redis cache isn't flushed, the next visit will still be served from cache since the entry is still in Redis.

Browsers do this automatically, which is why you can see the If-None-Match header in the request headers. The browser takes the etag from the earlier response for a URL and sends it as If-None-Match when the same URL is requested again.

Screenshot_2023-03-03_at_16.02.22
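The per-URL matching that browsers perform could look roughly like the sketch below. `EtagCache` and the etag value are hypothetical names chosen for illustration; the two URLs are the ones from the example scenario.

```python
from typing import Dict


class EtagCache:
    """Remembers the last etag seen for each URL, as a browser cache would."""

    def __init__(self) -> None:
        self._etags: Dict[str, str] = {}

    def store(self, url: str, response_headers: Dict[str, str]) -> None:
        # Save the etag under the exact URL that produced it.
        etag = response_headers.get("ETag")
        if etag:
            self._etags[url] = etag

    def request_headers(self, url: str) -> Dict[str, str]:
        # Only a repeat request for the same URL gets If-None-Match.
        etag = self._etags.get(url)
        return {"If-None-Match": etag} if etag else {}


first_page = "discussions.json?per_page=20"
second_page = "discussions.json?per_page=30"

cache = EtagCache()
cache.store(first_page, {"ETag": 'W/"page-1"'})  # hypothetical etag value

print(cache.request_headers(first_page))   # {'If-None-Match': 'W/"page-1"'}
print(cache.request_headers(second_page))  # {}
```

Keying the cache by the full URL, query string included, is what keeps the first page's etag from leaking into the second page request.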

Etag

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag

The "ETag" (entity tag) is a header value in the HTTP protocol that is used to identify a particular version of a resource. It is typically generated by the server and sent to the client along with the resource, and the client can later send the ETag back to the server to check if the resource has been updated since it was last retrieved.

ETag values are unique to each version of the resource being requested, not to each user. For example, if two different users request the same resource at different times and the resource has not been modified in between, they will receive the same ETag value for that resource.

However, if the resource is modified between the two requests, the server will generate a new ETag value for the updated version of the resource, and the two users will receive different ETag values for that resource.

When a user requests a resource for the first time, the client has no ETag stored for it, so the request carries no If-None-Match header. The server returns the full resource together with its current ETag value, which the client can then cache.

In general, ETag values are generated based on the content of the resource. The server may use a hash function or some other algorithm to generate a unique identifier for the content of the resource, and that identifier becomes the ETag value. When a user requests the resource again, the server can compare the ETag value in the user's request with the current ETag value for the resource. If the ETag values match, the server can send a 304 Not Modified response, indicating that the resource has not been updated since the user last requested it.
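As a minimal sketch of that server-side flow, assuming a content hash is used as the ETag (one common choice, not necessarily what GitLab does for this endpoint), the comparison might look like this. `make_etag` and `respond` are hypothetical helper names.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content, here via SHA-256."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, str, bytes]:
    """Return (status, etag, payload) for a request against `body`."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, etag, b""  # unchanged: no payload is sent again
    return 200, etag, body     # first visit or changed content


original = b'{"notes": ["first comment"]}'
updated = b'{"notes": ["first comment", "second comment"]}'

status_1, etag_1, _ = respond(original, None)    # first request
status_2, etag_2, _ = respond(original, etag_1)  # reload, nothing changed
status_3, etag_3, _ = respond(updated, etag_1)   # resource was modified

print(status_1, status_2, status_3)  # 200 304 200
```

Because the ETag depends only on the content, two different users requesting the unmodified resource would receive the same value, and any modification produces a new one.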

It's worth noting that some servers may use other caching mechanisms, such as the "Last-Modified" header, in addition to or instead of ETags. In these cases, the server would not need to generate a new ETag value for each request, since it can use the Last-Modified value to determine if the resource has been updated since the user last requested it. However, ETags are still commonly used as a caching mechanism, and they can provide more precise control over caching than Last-Modified, especially for resources that are generated dynamically.
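One reason ETags can be more precise is that HTTP dates only have one-second resolution. The sketch below shows a Last-Modified check missing a change made within the same second. `last_modified_header` and `not_modified_since` are hypothetical helpers and the timestamps are arbitrary.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def last_modified_header(modified_at: datetime) -> str:
    """Format a UTC timestamp as an HTTP date (whole seconds only)."""
    return format_datetime(modified_at.replace(microsecond=0), usegmt=True)


def not_modified_since(modified_at: datetime, if_modified_since: str) -> bool:
    """True if the server may answer 304 for this If-Modified-Since value."""
    client_time = parsedate_to_datetime(if_modified_since)
    return modified_at.replace(microsecond=0) <= client_time


# Two versions of a resource written within the same second.
v1 = datetime(2023, 3, 3, 16, 2, 22, 100000, tzinfo=timezone.utc)
v2 = datetime(2023, 3, 3, 16, 2, 22, 900000, tzinfo=timezone.utc)

header = last_modified_header(v1)
print(header)                          # Fri, 03 Mar 2023 16:02:22 GMT
print(not_modified_since(v2, header))  # True: the second change goes unnoticed
```

A content-based ETag would differ between the two versions, so the client would receive the updated resource.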

Closes #534

Edited by Nailia Iskhakova (OOO)
