feat: support model_provider argument in code suggestions evaluation
What does this merge request do and why?
- This adds a model_provider argument to the code suggestions evaluation, which is used for the AI Gateway client/source.
- This also changes the default intent value to completion. I think we will mostly be testing completion (especially with evaluations against AI Gateway).
How to set up and validate locally
Run the following:
poetry run eli5 code-suggestions evaluate \
--dataset="code-suggestions-input-testcases-v1" \
--source=ai-gateway \
--experiment-prefix=aigw-codegecko \
--model-name=code-gecko@002 \
--model-provider=vertex-ai \
--rate-limit=29 \
--limit=1 \
--evaluate-with-llm
The evaluation should run successfully.
In your AIGW logs, you should see a request served by vertex-ai/code-gecko (line breaks and # comments added for readability):
2024-08-21_06:25:59.76118 gitlab-ai-gateway : 2024-08-21 14:25:59 [info ] 172.16.123.1:49819 -
"POST /v2/code/completions HTTP/1.1" 200 blocked=False client_ip=172.16.123.1 client_port=49819
correlation_id=3d3f475d27f7413f80b568da2c0b7ec5 cpu_s=0.010920999999996184
duration_request=-1 duration_s=0.6571560000011232
editor_lang=None experiments=[{'name': 'exp_truncate_suffix', 'variant': 0}]
gitlab_duo_seat_count=None gitlab_global_user_id=None gitlab_host_name=None gitlab_instance_id=None
gitlab_language_server_version=None gitlab_realm=None gitlab_saas_duo_pro_namespace_ids=None gitlab_saas_namespace_ids=None
gitlab_version=None http_version=1.1 inference_duration_s=0.6488190839954768 lang=php
# these are the relevant details to watch out for
# model_engine is equal to the model_provider
meta.feature_category=code_suggestions method=POST
model_engine=vertex-ai model_name=code-gecko@002
model_output_length=3 model_output_length_stripped=3
model_output_score=-1.942774772644043
path=/v2/code/completions
# other details
post_processing_duration_s=0.0006997079981374554
prompt_length=110 prompt_length_stripped=94 prompt_symbols={}
status_code=200 suffix_length=0
url=http://gdk.test:5052/v2/code/completions
user_agent=python-requests/2.32.3
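If you would rather not scan the log entry by eye, the check can be scripted. The following is a minimal sketch, not part of this merge request: the helper names parse_log_fields and served_by are hypothetical, and it assumes the log entry is made of whitespace-separated key=value tokens like the one shown above.

```python
def parse_log_fields(log_text: str) -> dict[str, str]:
    """Collect key=value tokens from a structured log entry into a dict.

    Values that contain spaces (for example the experiments list) are
    truncated at the first space; that is acceptable here because only
    simple fields such as model_engine and model_name are inspected.
    """
    fields: dict[str, str] = {}
    for line in log_text.splitlines():
        line = line.strip()
        # Skip the explanatory comment lines added for readability.
        if line.startswith("#"):
            continue
        for token in line.split():
            key, sep, value = token.partition("=")
            if sep and key:
                fields[key] = value
    return fields


def served_by(log_text: str, provider: str, model: str) -> bool:
    """Return True if the log entry reports the expected provider and model."""
    fields = parse_log_fields(log_text)
    return (
        fields.get("model_engine") == provider
        and fields.get("model_name") == model
    )


# Excerpt of the log entry shown above.
sample = """
meta.feature_category=code_suggestions method=POST
model_engine=vertex-ai model_name=code-gecko@002
status_code=200 suffix_length=0
"""

print(served_by(sample, "vertex-ai", "code-gecko@002"))
```

The provider and model passed to served_by should match the --model-provider and --model-name values used in the evaluation command.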
Here is an example result of the above evaluation command: click to see the Langsmith experiment result.
Merge request checklist
- Tests added for new functionality. If not, please raise an issue to follow up.
- Documentation added/updated, if needed.
Edited by Pam Artiaga