Run LLM evaluations concurrently
What does this MR do and why?
- Updates the QA evaluation tests to run their network requests concurrently in threads.
- qa_epic_spec.rb and qa_issue_spec.rb are merged into qa_evaluation_spec.rb because they can now be run together quickly (the evaluation process takes about a minute to complete, and the whole CI job takes 15 minutes in total).
- Introduces a QA test made up of a single, simple question that's fast and cheap to run.
- (CI configuration changes) Adds a new CI job that runs the QA test. The job is not optional and is not allowed to fail.
- Improves the documentation in various places.
- (CI configuration changes) Refactors .gitlab/ci/rails.gitlab-ci.yml and .gitlab/ci/rules.gitlab-ci.yml for maintainability.
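
For reviewers unfamiliar with the approach, here is a minimal sketch of running slow requests in threads so that they overlap instead of running one after another. It is not the code from this MR: `evaluate_concurrently`, `fake_ask`, and the sample questions are hypothetical names chosen for illustration, and the sleep stands in for a real LLM request.

```ruby
# Hypothetical sketch: `evaluate_concurrently` and the `ask` callable are
# illustrative names, not identifiers taken from this MR.
def evaluate_concurrently(questions, &ask)
  # Start one thread per question so the slow network calls overlap.
  threads = questions.map do |question|
    Thread.new { ask.call(question) }
  end

  # Thread#value joins the thread and returns the block's result,
  # so answers come back in the same order as the questions.
  threads.map(&:value)
end

# Stand-in for a real LLM request: sleeps briefly instead of calling a network API.
fake_ask = lambda do |question|
  sleep 0.1
  "answer to: #{question}"
end

questions = ["Summarize this issue", "Summarize this epic"]
answers = evaluate_concurrently(questions, &fake_ask)
```

Because each thread spends most of its time waiting on I/O, the total wall-clock time is close to that of the slowest single request rather than the sum of all requests. A real test suite would likely also cap the number of threads and handle request errors, which this sketch leaves out.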
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Edited by euko