Duo Chat: Change prompt to ignore docs when not useful (!149731) · Merge requests · GitLab.org / GitLab

Bruno Cardoso requested to merge bc/duo-chat-investigate-codegen-and-docs into master Apr 16, 2024

What does this MR do and why?

This MR improves Duo Chat's performance on code generation questions for which the GitlabDocumentation tool was selected. The prompt change allows the LLM to ignore the content retrieved from the GitLab docs when it is not useful, so that DirectAnswer can try to answer the question instead.
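The idea behind the prompt change can be sketched as follows. This is only an illustration under assumptions: the function name `build_docs_prompt`, the `<doc>` wrapping, and the instruction wording are hypothetical and are not the prompt text changed in this MR.

```python
def build_docs_prompt(question: str, doc_snippets: list[str]) -> str:
    """Assemble a documentation-grounded prompt with an explicit opt-out clause.

    Hypothetical sketch: the wording below is an assumption, not the
    actual Duo Chat prompt.
    """
    # Wrap each retrieved snippet so the model can tell docs from instructions.
    context = "\n\n".join(f"<doc>{snippet}</doc>" for snippet in doc_snippets)
    return (
        "Use the documentation below to answer the question.\n"
        "If the documentation is not relevant to the question, "
        "ignore it and answer from your own knowledge.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )
```

With a clause like this, a code generation question that was routed to the documentation tool is no longer forced to be answered from unrelated docs content.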

Follow-up on gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/ai-experiments#17 (comment 1862475183).

Results on the sampled code generation dataset

Compared runs: current with claude-3, with fix.

How to set up and validate locally

Config file used:

{
  "beam_config": {
    "pipeline_options": {
      "runner": "DirectRunner",
      "project": "dev-ai-research-0e2f8974",
      "region": "us-central1",
      "temp_location": "gs://prompt-library/tmp/",
      "save_main_session": false
    }
  },
  "input_bq_table": "dev-ai-research-0e2f8974.duo_chat.sampled_code_generation_v1",
  "output_sinks": [
    {
      "type": "local",
      "path": "data/output_claude3_mbpp_fix_codegen",
      "prefix": "experiment"
    }
  ],
  "throttle_sec": 1,
  "batch_size": 10,
  "input_adapter": "mbpp",
  "eval_setup": {
    "answering_models": [
      {
        "name": "duo-chat",
        "parameters": {
          "base_url": "http://127.0.0.1:3000"
        },
        "prompt_template_config": {
          "templates": [
            {
              "name": "empty",
              "template_path": "data/prompts/duo_chat/answering/empty.txt.example"
            }
          ]
        }
      },
      {
        "name": "human",
        "prompt_template_config": {
          "templates": [
            {
              "name": "human",
              "template_path": "data/prompts/duo_chat/answering/empty.txt.example"
            }
          ]
        }
      }
    ],
    "metrics": [
      {
        "metric": "similarity_score"
      }
    ]
  }
}

For instructions on running the evaluation, refer to https://gitlab.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/-/blob/main/doc/how-to/run_duo_chat_eval.md.
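Before starting a run, it can help to sanity-check the config file. The following is a minimal sketch, assuming the config above is saved as a JSON file; the helper `check_eval_config` and the specific checks are hypothetical and are not part of prompt-library.

```python
import json
import sys


def check_eval_config(config: dict) -> list[str]:
    """Return a list of human-readable problems found in an eval config dict.

    Hypothetical helper: the checks mirror the keys used in the config
    shown above, not an official schema.
    """
    problems = []
    # Top-level keys that appear in the example config.
    for key in ("beam_config", "input_bq_table", "output_sinks", "eval_setup"):
        if key not in config:
            problems.append(f"missing top-level key: {key}")

    setup = config.get("eval_setup", {})
    models = setup.get("answering_models", [])
    names = [model.get("name") for model in models]
    if "duo-chat" not in names:
        problems.append("no 'duo-chat' answering model configured")
    for model in models:
        # The duo-chat model needs a base_url pointing at the local instance.
        if model.get("name") == "duo-chat" and "base_url" not in model.get("parameters", {}):
            problems.append("duo-chat model has no parameters.base_url")
    if not setup.get("metrics"):
        problems.append("no metrics configured")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_config.py path/to/config.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        for problem in check_eval_config(json.load(handle)):
            print(problem)
```

An empty result means only that the expected keys are present; it does not verify that the BigQuery table or the local instance at `base_url` is reachable.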

Edited Apr 16, 2024 by Bruno Cardoso
