Duo Chat: Change prompt to ignore docs when not useful
What does this MR do and why?
This MR improves performance on code generation questions for which the GitlabDocumentation tool happens to be selected.
The prompt change allows the LLM to ignore the content retrieved from the GitLab docs when it is not useful, so that DirectAnswer can try to answer the question instead.
Follow-up on gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/ai-experiments#17 (comment 1862475183).
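To illustrate the idea behind the prompt change, here is a minimal Python sketch. It is not the prompt text or code from this MR: the instruction wording, the `<documentation>` wrapper, and the names `IGNORE_DOCS_INSTRUCTION` and `build_docs_prompt` are all hypothetical and only show how an "ignore the docs if they do not help" instruction could sit alongside the retrieved content.

```python
from typing import List

# Hypothetical wording: NOT the instruction shipped in this MR.
IGNORE_DOCS_INSTRUCTION = (
    "If the documentation below is not relevant to the question, "
    "ignore it and answer from your own knowledge."
)


def build_docs_prompt(question: str, retrieved_docs: List[str]) -> str:
    """Assemble a prompt that lets the model disregard unhelpful docs.

    The layout (instruction, then docs, then question) is an assumption
    made for illustration only.
    """
    if retrieved_docs:
        docs_block = "\n\n".join(retrieved_docs)
    else:
        docs_block = "(no documentation retrieved)"
    return (
        f"{IGNORE_DOCS_INSTRUCTION}\n\n"
        f"<documentation>\n{docs_block}\n</documentation>\n\n"
        f"Question: {question}"
    )
```

With an instruction like this, a question such as "Write a Python function that reverses a string" can still be answered even when the retrieved docs are unrelated to it.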
Results on the sampled code generation dataset
current with claude-3 | with fix |
---|---|
How to set up and validate locally
Config file used:
{
  "beam_config": {
    "pipeline_options": {
      "runner": "DirectRunner",
      "project": "dev-ai-research-0e2f8974",
      "region": "us-central1",
      "temp_location": "gs://prompt-library/tmp/",
      "save_main_session": false
    }
  },
  "input_bq_table": "dev-ai-research-0e2f8974.duo_chat.sampled_code_generation_v1",
  "output_sinks": [
    {
      "type": "local",
      "path": "data/output_claude3_mbpp_fix_codegen",
      "prefix": "experiment"
    }
  ],
  "throttle_sec": 1,
  "batch_size": 10,
  "input_adapter": "mbpp",
  "eval_setup": {
    "answering_models": [
      {
        "name": "duo-chat",
        "parameters": {
          "base_url": "http://127.0.0.1:3000"
        },
        "prompt_template_config": {
          "templates": [
            {
              "name": "empty",
              "template_path": "data/prompts/duo_chat/answering/empty.txt.example"
            }
          ]
        }
      },
      {
        "name": "human",
        "prompt_template_config": {
          "templates": [
            {
              "name": "human",
              "template_path": "data/prompts/duo_chat/answering/empty.txt.example"
            }
          ]
        }
      }
    ],
    "metrics": [
      {
        "metric": "similarity_score"
      }
    ]
  }
}
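Before running the evaluation, it can help to confirm that the config points at the intended local instance and output path. The following is a minimal Python sketch that relies only on the keys visible in the config above; the helper names `summarize_config` and `points_at_localhost` are hypothetical and not part of the evaluation tooling.

```python
import json
from urllib.parse import urlparse


def summarize_config(config: dict) -> dict:
    """Collect the fields worth double-checking before a local run."""
    eval_setup = config["eval_setup"]
    models = eval_setup["answering_models"]
    return {
        "input_table": config["input_bq_table"],
        "models": [m["name"] for m in models],
        "metrics": [m["metric"] for m in eval_setup["metrics"]],
        # Only sinks of type "local" write to the local filesystem.
        "local_output_paths": [
            s["path"] for s in config["output_sinks"] if s.get("type") == "local"
        ],
        # Not every answering model has parameters (e.g. "human").
        "base_urls": {
            m["name"]: m["parameters"]["base_url"]
            for m in models
            if "base_url" in m.get("parameters", {})
        },
    }


def points_at_localhost(url: str) -> bool:
    """Return True when the URL targets a locally running instance."""
    return urlparse(url).hostname in {"127.0.0.1", "localhost"}


def load_config(path: str) -> dict:
    """Read the JSON config from disk; the path is supplied by the caller."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, assuming the config is saved under a file name of your choosing, `summarize_config(load_config(path))["base_urls"]` should map `duo-chat` to `http://127.0.0.1:3000`, and `points_at_localhost` should return `True` for that URL.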
Edited by Bruno Cardoso