perf: set 30 sec to keep-alive expiry for anthropic clients

Shinya Maeda requested to merge perf-anthropic-keep-alive into main

What does this merge request do and why?

We currently use the default keep-alive expiry of 5 sec in the Anthropic clients. This is so short that idle connections expire quickly, so requests that arrive more than 5 sec apart each need a new connection and a new TLS handshake.

This MR extends the expiry to 30 sec so that connections are reused more often, which reduces overall request latency.

How to set up and validate locally

  1. Launch the server: poetry run ai_gateway.
  2. Request via curl:
curl -X 'POST' \
  'http://0.0.0.0:5052/v1/chat/agent' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "prompt_components": [
    {
      "type": "string",
      "metadata": {
        "source": "string",
        "version": "string"
      },
      "payload": {
        "content": "\n\nHuman: Hi, How are you?\n\nAssistant:",
        "provider": "anthropic",
        "model": "claude-2.1",
        "params": {
          "stop_sequences": [
            "\n\nHuman",
            "Observation:"
          ],
          "temperature": 0.2,
          "max_tokens_to_sample": 2048
        },
        "model_endpoint": "string",
        "model_api_key": "string"
      }
    }
  ],
  "stream": false
}'
  3. Repeat the request several times within 30 secs and check in ai-gateway.log that the TLS handshake happens only once:
{"correlation_id": "c8082d100e52489f9880e0aee57a61c9", "logger": "httpcore.connection", "level": "debug", "type": "mlops", "stage": "main", "timestamp": "2024-08-08T06:10:19.176131Z", "message": "start_tls.started ssl_context=<ssl.SSLContext object at 0x795c0ff5bb40> server_hostname='api.anthropic.com' timeout=5.0"}
{"correlation_id": "c8082d100e52489f9880e0aee57a61c9", "logger": "httpcore.connection", "level": "debug", "type": "mlops", "stage": "main", "timestamp": "2024-08-08T06:10:19.189728Z", "message": "start_tls.complete return_value=<httpcore._backends.anyio.AnyIOStream object at 0x795c0f22e620>"}
  4. Wait for more than 30 secs and send the request again. Confirm that the TLS handshake happens again.
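Checking the log by eye gets tedious when the request is repeated many times. The following is a rough sketch of a helper that counts TLS handshakes in the log, assuming each line of ai-gateway.log is a JSON object shaped like the two entries shown above. The function name `count_tls_handshakes` is hypothetical and not part of the gateway.

```python
import json
from typing import Iterable


def count_tls_handshakes(
    lines: Iterable[str], hostname: str = "api.anthropic.com"
) -> int:
    """Count `start_tls.started` events logged by httpcore for `hostname`."""
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Skip lines that are not JSON (e.g. plain-text startup output).
            continue
        if not isinstance(record, dict):
            continue
        message = str(record.get("message", ""))
        if (
            record.get("logger") == "httpcore.connection"
            and message.startswith("start_tls.started")
            and f"server_hostname='{hostname}'" in message
        ):
            count += 1
    return count
```

Passing an open file handle for ai-gateway.log to `count_tls_handshakes` should return 1 after the repeated requests in step 3 and 2 after the delayed request in step 4, provided the log was empty before the test started.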

Merge request checklist

  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
Edited by Shinya Maeda
