Clean up duo-chat eval prompt by removing examples

Hongtao Yang requested to merge clean-up-duo-chat-eval-prompt into main

What does this merge request do and why?

This MR cleans up the duo-chat eval prompt by removing all of its few-shot examples, because we've seen some prompt contamination caused by them (see #543).

This MR also upgrades the evaluating model from text-bison to claude-3.5.

Partially resolves #543
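
For reviewers unfamiliar with the change, here is a rough sketch of what an eval prompt without few-shot examples might look like. The template text, the `build_eval_prompt` helper, and the 1-to-10 scale are all hypothetical and are not taken from this repository; the actual prompt in this MR may differ.

```python
# Hypothetical sketch only: the template wording, function name, and rating
# scale are assumptions for illustration, not the prompt used in this MR.

ZERO_SHOT_TEMPLATE = """You are an impartial judge evaluating an AI assistant's answer.

Question:
{question}

Answer:
{answer}

Rate the answer for correctness on a scale from 1 to 10 and briefly explain your rating."""


def build_eval_prompt(question: str, answer: str) -> str:
    """Fill the template with the item under evaluation.

    No worked examples are included, so the judge model only sees the
    instructions plus the question/answer pair being scored.
    """
    return ZERO_SHOT_TEMPLATE.format(question=question, answer=answer)


prompt = build_eval_prompt(
    "How do I create a merge request?",
    "Push a branch and open a merge request from the UI.",
)
```

Because the prompt holds only instructions and the pair under evaluation, the judge has no example answers or scores it could echo back.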

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

Merge request checklist

  • I've run the affected pipeline(s) to validate that nothing is broken.
  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
