Provide the model metadata as part of the direct_access details
## What does this MR do and why?
This adds the model metadata (`model_provider` and `model_name`) to the response of the `/code_suggestions/direct_access` endpoint.

This is behind the `use_codestral_for_code_completions` feature flag. We will only provide the model metadata once we have switched to the Codestral model for code completions. For now, while we are still using code-gecko, we return an empty hash.
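The gating can be pictured as a small helper. This is only a sketch of the behaviour described above; the method and constant names are assumptions, not the MR's actual code:

```ruby
# Hypothetical sketch of the feature-flag gating; names are illustrative,
# not the MR's actual implementation.
CODESTRAL_DETAILS = {
  model_provider: 'vertex-ai',
  model_name: 'codestral@2405'
}.freeze

# Returns the model metadata only when the flag is enabled; otherwise an
# empty hash, matching the current code-gecko behaviour described above.
def model_details(use_codestral_enabled)
  use_codestral_enabled ? CODESTRAL_DETAILS : {}
end
```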
## MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
## Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
| Before | After |
| ------ | ----- |
## How to set up and validate locally

### Testing the endpoint directly
1. Enable the `use_codestral_for_code_completions` feature flag.
2. Send a request to the `/code_suggestions/direct_access` endpoint:

   ```shell
   ACCESS_TOKEN="<your-personal-access-token>"

   curl "http://gdk.test:3000/api/v4/code_suggestions/direct_access" \
     -X POST \
     --header "Authorization: Bearer $ACCESS_TOKEN" \
     | json_pp -json_opt pretty,canonical
   ```
3. Verify that the response has the `model_provider` (`vertex-ai`) and `model_name` (`codestral@2405`):

   ```json
   {
      "base_url" : "http://gdk.test:5052",
      "expires_at" : 1723117152,
      "headers" : {<headers here>},
      "model_details" : {
         "model_name" : "codestral@2405",
         "model_provider" : "vertex-ai"
      },
      "token" : "<token here>"
   }
   ```
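If you want to script this check rather than eyeball the output, the relevant fields can be read out of the JSON like so (a sketch only; the sample payload is abbreviated from the response above, with headers and token omitted):

```ruby
require 'json'

# Abbreviated sample of the direct_access response shown above.
response = JSON.parse(<<~JSON)
  {
    "base_url": "http://gdk.test:5052",
    "model_details": {
      "model_name": "codestral@2405",
      "model_provider": "vertex-ai"
    }
  }
JSON

# An empty hash here means the feature flag is off (code-gecko path).
details = response.fetch('model_details', {})

puts details['model_provider'] # => vertex-ai
puts details['model_name']     # => codestral@2405
```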
### End-to-end testing
1. Make sure that the GitLab Workflow extension for your IDE is pointed to your local GitLab instance (see the setup instructions here: https://gitlab.com/gitlab-org/gitlab-vscode-extension#setup).
2. Make sure your AI Gateway has the changes in gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!1172 (merged).
3. Make sure your Rails monolith has the changes in this MR, and that the `use_codestral_for_code_completions` feature flag is enabled.
4. Open a file in one of your repositories. Make sure that the file is in one of the supported languages.
5. Start writing code and try to get a code completion suggestion.
6. When you get a suggestion, verify that the AI Gateway is using the correct provider (`vertex-ai`) and model (`codestral@2405`). You can check this in your AI Gateway logs. Here is an example log (line breaks and comments added for clarity):
```
2024-08-09_09:59:35.04924 gitlab-ai-gateway : 2024-08-09 11:59:35 [info ] 172.16.123.1:57791 - "POST /v2/completions HTTP/1.1" 200
  blocked=False
  client_ip=172.16.123.1
  client_port=57791
  correlation_id=f0f857cd54514ecc8ae3952354c4c8a9
  cpu_s=0.09014600000000073
  duration_request=-1
  duration_s=2.5809748330211733
  editor_lang=None
  gitlab_global_user_id=<gitlab_global_user_id>
  gitlab_host_name=gdk.test
  gitlab_instance_id=<gitlab_instance_id>
  gitlab_language_server_version=None
  gitlab_realm=self-managed
  gitlab_saas_duo_pro_namespace_ids=None
  gitlab_saas_namespace_ids=None
  gitlab_version=17.3.0
  http_version=1.1
  inference_duration_s=2.5353962079971097
  lang=python
  meta.feature_category=code_suggestions
  method=POST

  # this is the relevant information
  model_engine=vertex-ai
  model_name=vertex_ai/codestral@2405

  model_output_length=489
  model_output_length_stripped=438
  model_output_score=100000
  path=/v2/completions
  prompt_length=6657
  prompt_length_stripped=5127
  status_code=200
  url=http://gdk.test:5052/v2/completions
  user_agent=node-fetch/1.0 (+https://github.com/bitinn/node-fetch)
```
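If you are grepping a lot of these, the model fields can be pulled out of a structured log line with a quick key=value parse. This is a sketch, run against a truncated sample of the line above; field names are taken from that sample log:

```ruby
# Truncated sample of the AI Gateway log line shown above.
line = 'method=POST model_engine=vertex-ai model_name=vertex_ai/codestral@2405 status_code=200'

# Collect key=value pairs into a hash (keys are word characters,
# values run to the next whitespace).
fields = line.scan(/(\w+)=(\S+)/).to_h

puts fields['model_engine'] # => vertex-ai
puts fields['model_name']   # => vertex_ai/codestral@2405
```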
Related to #470171