Implement probes for checking self-hosted AIGW installation

What does this MR do and why?

Why?

For #491051 (closed)

This MR adds probes that perform several checks on air-gapped instances.

An instance is assumed to be air-gapped if it has an offline cloud license (as opposed to an online cloud license). See the docs on licensing here for details.

An air-gapped instance, by definition, cannot contact the outside world, and hence cannot connect to the GitLab-hosted AI Gateway (via CloudConnector) either.

Such instances will resort to self-hosting their own AI Gateway, and for Custom Models GA we are serving air-gapped instances first. The probes therefore need to differ from the default ones we currently execute.

For an air-gapped instance running its own self-hosted AI Gateway, we have to make 4 checks:

| Check | Uses probe |
| --- | --- |
| Is the environment variable ENV['AI_GATEWAY_URL'] defined and present? | SelfHosted::AiGatewayUrlPresenceProbe |
| Is the URL defined by ENV['AI_GATEWAY_URL'] reachable? | HostProbe |
| Does the instance have a valid license to access code suggestions? | SelfHosted::CodeSuggestionsLicenseProbe |
| Does code completion work? (We test this by passing a piece of code to the configured model via the self-hosted AIGW.) | EndToEndProbe |

SelfHosted::AiGatewayUrlPresenceProbe and SelfHosted::CodeSuggestionsLicenseProbe are new probes and are introduced in this MR.
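
To illustrate the shape of the first check, here is a minimal sketch of a presence probe and a runner that aggregates results. It is not the code in this MR: the `ProbeResult` struct, the un-namespaced class, the injectable `env` argument, the `run_probes` helper, and the failure message are all assumptions made for illustration. Only the probe name and the success message follow the output shown under Results.

```ruby
# Hypothetical sketch only; names and structure are illustrative,
# not the implementation introduced by this MR.
ProbeResult = Struct.new(:name, :success, :message)

class AiGatewayUrlPresenceProbe
  ENV_VARIABLE = 'AI_GATEWAY_URL'
  PROBE_NAME = 'ai_gateway_url_presence_probe'

  # env is injectable so the probe can be exercised without touching ENV.
  def initialize(env = ENV)
    @env = env
  end

  def execute
    url = @env[ENV_VARIABLE].to_s.strip

    if url.empty?
      # Failure wording is an assumption; the MR does not show it.
      ProbeResult.new(PROBE_NAME, false,
        "Environment variable #{ENV_VARIABLE} is not set.")
    else
      ProbeResult.new(PROBE_NAME, true,
        "Environment variable #{ENV_VARIABLE} is set to #{url}.")
    end
  end
end

# Runs every probe in order; overall success requires all probes to pass.
def run_probes(probes)
  results = probes.map(&:execute)
  { success: results.all?(&:success), probe_results: results }
end
```

In this sketch every probe runs even if an earlier one fails. Whether the real probe chain stops at the first failure is not stated in this description.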

Results

The results of running the probes are already exposed via the GraphQL endpoint, https://docs.gitlab.com/ee/api/graphql/reference/#querycloudconnectorstatus

After this change, on an instance that is air-gapped (i.e., one that uses an offline cloud license), the endpoint gives the following output:

{
    "data": {
        "cloudConnectorStatus": {
            "success": true,
            "probeResults": [
                {
                    "name": "ai_gateway_url_presence_probe",
                    "success": true,
                    "message": "Environment variable AI_GATEWAY_URL is set to http://gdk.test:5052.",
                    "__typename": "CloudConnectorProbeResult"
                },
                {
                    "name": "host_probe",
                    "success": true,
                    "message": "gdk.test reachable.",
                    "__typename": "CloudConnectorProbeResult"
                },
                {
                    "name": "code_suggestions_license_probe",
                    "success": true,
                    "message": "License is valid to access code suggestions.",
                    "__typename": "CloudConnectorProbeResult"
                },
                {
                    "name": "end_to_end_probe",
                    "success": true,
                    "message": "Authentication with AI Gateway services succeeded.",
                    "__typename": "CloudConnectorProbeResult"
                }
            ],
            "__typename": "CloudConnectorStatus"
        }
    }
}
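
A client consuming this response could pick out the failing probes as sketched below. The `failed_probes` helper is hypothetical and not part of this MR; it only assumes the response shape shown above (`data.cloudConnectorStatus.probeResults` with `name`, `success`, and `message` fields).

```ruby
require 'json'

# Hypothetical helper: returns "name: message" strings for each probe
# whose "success" field is not true. Returns [] if the status is absent.
def failed_probes(response_body)
  status = JSON.parse(response_body).dig('data', 'cloudConnectorStatus')
  return [] if status.nil?

  status.fetch('probeResults', [])
        .reject { |probe| probe['success'] }
        .map { |probe| "#{probe['name']}: #{probe['message']}" }
end
```

For the output shown above, where every probe succeeded, this would return an empty array.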

Next

The UI in admin/code_suggestions should use the results from the new probes and display them in the Health Check component. This will be done in Display results of probes (#491564 - closed).

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

Related to #491051 (closed)

Edited by Manoj M J
