Clean up duo-chat eval prompt by removing examples
## What does this merge request do and why?
This MR cleans up the duo-chat eval prompt by removing all examples, because we've seen prompt contamination with the few-shot prompts: #543

It also upgrades the evaluating model from text-bison to claude-3.5.
Partially resolves #543
## How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
## Merge request checklist
- [ ] I've run the affected pipeline(s) to validate that nothing is broken.
- [ ] Tests added for new functionality. If not, please raise an issue to follow up.
- [ ] Documentation added/updated, if needed.