docs: GitLab AI Gateway Deployment Guide

David O'Regan requested to merge oregand-master-patch-04f6 into master

This merge request updates the AI Gateway installation documentation. It adds autoscaling requirements, configuration examples for small, medium, and large deployments (including the number of concurrent requests each can handle), and guidance on resource allocation and performance. It also covers resource specifications for the AI Gateway container, mitigation strategies for resource contention, and scaling recommendations, so that users can configure and scale the AI Gateway for their own needs and usage patterns.

Part of https://gitlab.com/gitlab-org/gitlab/-/issues/509814+

Edited by Sean Carroll
