v0.8.0 · 6c7032ee

Version 0.8.0

Features
- Add troubleshoot slash command

Fixes
- Increase timeout to 30 seconds for GraphQL calls

Internal
- Update dependency pytest to v8.3.1
- Update dependency evilmartians/lefthook to v1.7.4
- Update dependency google-cloud-aiplatform to v1.59.0
- Update dependency evilmartians/lefthook to v1.7.2
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.20.4

v0.6.7 · c14129f3

Version 0.6.7

Features
- Add more ETV daily run partitions
- Add ETV and RCA daily run configs
- Add remaining metrics to the ETV pipeline
- Added dataset versions

Fixes
- Add OpenAI token as CI variable
- Better judge for ETV evaluation
- Silence verbose warning

Internal
- Update dependency pydantic to v2.8.2
- Update dependency google-cloud-aiplatform to v1.58.0
- Update docker Docker tag to v27.0.3
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.18.0
- Update dependency tenacity to v8.5.0
- Update dependency langchain to v0.2.6
- Update dependency mypy to v1.10.1

Docs
- Added documentation on how to create new datasets

v0.6.3 · 0464aa78

Version 0.6.3

- Added configs to generate answers for foundational models
- Better error handling for LLM requests
- Added RCA and ETV models to call GraphQL endpoints

v0.4.0 · 91a0c757

Version 0.4.0

- Removes `input_adapter`. This breaks existing configs and requires the new config format.
- Introduces a new code-creation pipeline
- Added Mixtral

0.3.1 · 64563405

New Features:
- Support writing results to local files
- New config field (`output_sinks`) to write results to multiple locations
- Added ability to compare against ground truth for code generation datasets

Optimization:
- Reduce the number of text-embedding API calls by 50% in the similarity score metric

Misc:
- Added test coverage report in CI
- Better logging
- Upgrade Python to 3.11.8

0.2.0 · 475a31dc

- Various bug fixes.
- Introduce a new config format where the user can specify one or more metrics.
- Introduce a new metric "Collective LLM Judge".
- Add a new field to the output table to show the timestamp of the pipeline run.