Adjust the Mistral prompt for Mixtral 8x22B (MoE)
What does this MR do and why?
In this merge request, we adjust the Mistral prompt to be Mixtral 8x22B (MoE) friendly and to eliminate hallucinations.
During testing of Mixtral 8x22B, we encountered hallucinations caused by the examples we provided in the prompt.
The MoE model was reusing these examples when generating responses.
For instance, we observed significant hallucinations where the output mentioned "Arkansas", a string taken from the example content.
After removing these examples and modifying the prompt to act as a code generation agent rather than a code completion tool, the hallucinations disappeared and the evaluation scores improved; a sketch of the kind of change is shown below.
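To make the change concrete, here is a minimal, hypothetical sketch in Python. The template wording, the example content, and the `build_prompt` helper are all illustrative assumptions, not the actual prompt shipped in this MR:

```python
# Hypothetical sketch of the prompt change; illustrative only, not the
# prompt text from this MR.

# Before: the prompt framed the model as a code completion tool and carried
# hard-coded examples. Mixtral 8x22B tended to echo example content (such as
# the "Arkansas" string mentioned above) back into its responses.
OLD_PROMPT_TEMPLATE = """\
You are a code completion tool. Complete the user's code.

Example:
Input: def get_home_state():
Output: return "Arkansas"

Input: {prefix}
Output:"""

# After: the prompt frames the model as a code generation agent and drops the
# examples entirely, so there is nothing for the model to copy verbatim.
NEW_PROMPT_TEMPLATE = """\
You are a code generation agent. Generate the code that belongs between the
given prefix and suffix. Return only the generated code, with no explanations.

Prefix:
{prefix}

Suffix:
{suffix}
"""


def build_prompt(prefix: str, suffix: str) -> str:
    """Fill the example-free template with the code around the cursor."""
    return NEW_PROMPT_TEMPLATE.format(prefix=prefix, suffix=suffix)


if __name__ == "__main__":
    print(build_prompt("def add(a, b):\n    ", "\n\nprint(add(1, 2))"))
```

Dropping the in-prompt examples removes the literal strings the model was echoing back, which is consistent with the evaluation results below.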
Mixtral Prompt Evaluation: Control similarity score ~0.91 and Test similarity score ~0.89 (previously ~0.87).
- Control: 0.91
- Current: 0.88769758238512009, Variance: -0.2
- Sample Size: 425
- Success: Yes; for a Mistral-family model compared to Anthropic, this is a good result.
On GCP (BigQuery):
```sql
SELECT avg(similarity_score) FROM `dev-ai-research-0e2f8974.code_suggestion_experiments.mhamda_mixtral_22b_20240603_150423__similarity_score` LIMIT 1000
```
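For reproducibility, the same average can be pulled from Python with the `google-cloud-bigquery` client. The project and table name are taken verbatim from the query above; the client setup is a minimal sketch that assumes Application Default Credentials are configured:

```python
from google.cloud import bigquery

# Minimal sketch: fetch the average similarity score for one evaluation run.
# The LIMIT from the original query is dropped because AVG() already reduces
# the result to a single row.
client = bigquery.Client(project="dev-ai-research-0e2f8974")

QUERY = """
SELECT AVG(similarity_score) AS avg_similarity
FROM `dev-ai-research-0e2f8974.code_suggestion_experiments.mhamda_mixtral_22b_20240603_150423__similarity_score`
"""

row = next(iter(client.query(QUERY).result()))
print(f"Average similarity score: {row.avg_similarity:.4f}")
```

The same query pattern applies to the Mistral run below; only the table name changes.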
Mistral Prompt Evaluation: Control similarity score ~0.91 and Test similarity score ~0.86 (previously ~0.87).
- Control: 0.91
- Before: 0.86543818249421944
- Current: 0.85687403566696974, Variance: < -0.1
- Sample Size: 425
- Success: Yes; for a Mistral-family model compared to Anthropic, this is a good result.
On GCP (BigQuery):
```sql
SELECT avg(similarity_score) FROM `dev-ai-research-0e2f8974.code_suggestion_experiments.mhamda_mistral-2nd-run_20240603_154840__similarity_score` LIMIT 1000
```
We can definitely iterate on the prompt, and there is a follow-up issue for that, but the current prompt works for both Mistral and Mixtral.