v0.18.2 · c5ef85ac

Features:

- feat: properly handle false positives in VR ([`143519f`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/143519fefb24c30b5884773d14b265561b9af55c))
- feat: update the vulnerability extractions to support the v7 dataset creation ([`0384335`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/0384335f3270a5b4109459ac18b4b184f623e6b6))

Chores:

- chore(deps): update dependency gitlab-com/gl-infra/common-ci-tasks to v2.47.0 ([`1d418c3`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/1d418c309d81ede4208e12cffc26c369e335edff))
- chore: fix typos across code base ([`7386904`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/738690465a50bd0ce590f2c75c95120633e9bbb6))
- chore: refresh duo chat config example and doc ([`67b9d43`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/67b9d43b1ddb33303232fa1ad627270f03bf4389))
- chore(deps): update dependency gitlab-com/gl-infra/common-ci-tasks to v2.46.0 ([`6d27be6`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/6d27be6cc0bbb8faf03425fb01c815f5a25fe7c7))

v0.18.1 · 171cfc0d

Features:

- feat: add work_items dataset for duo-chat ([`024dd3b`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/024dd3b3040c5d0b51d69f1c09465c19b4253561))

Fixes:

- fix: update merge request daily run to use v2 dataset ([`c6ab84e`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/c6ab84eb13a7b5d276fe96b3ead70c4970998461))

Chores:

- chore: clean up command descriptions ([`ee1ce63`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/ee1ce634c0a2cad179b148c3d161d704e6c52f4f))
- chore(deps): update dependency typer to ^0.13.0 ([`bf08c1c`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/bf08c1c27d80d1ea3baeeaf03a5ea3a521469b54))
- chore(deps): update dependency streamlit to v1.40.0 ([`60859e0`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/60859e0994ef2c238b607e349504639681513593))
- chore(deps): update dependency tokenizers to v0.20.3 ([`f02b397`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/f02b39703d699db46bad03ca10d025c916e9b71a))
- chore(deps): update dependency pydantic-xml to >=2.14.0,<2.15.0 ([`655ac21`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/655ac21da50ae74fea1162e25191775c4080d7df))
- chore(deps): update dependency langchain to v0.3.7 ([`4681b87`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/4681b87dac96e62a6c785668958bf69e8aae6275))
- chore(deps): update dependency gitlab-com/gl-infra/common-ci-tasks to v2.45.1 ([`8ba0902`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/8ba090290b26e148d445273367b85feee57d2259))

v0.18.0 · 1957efe7

Features

- feat: move vulnerability extraction to Prompt Library ([`00ebde2`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/00ebde27760ede05efc3ffffde77b4f4b8854bd2))

Fixes

- fix: gcloud CLI ([`273e600`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/273e60004e0d5a2fb844ba2194aa8e8d47c9f47f))

Refactoring

- refactor: add resource_id to the issue/epic dataset ([`a5867f3`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/a5867f3eae4d06178b237dd558cd3ef533355ca9))
- refactor: remove rca stuff from duo-chat ([`c403451`](https://github.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/commit/c403451ac6a6e27f1817084e7a4495744a4af091))

v0.17.0 · 0f13491c

Version 0.17.0

Features

- Add functional test to vr daily run
- Fix delete MR script to clean-up MR from VR
- Upgrade to latest claude 3.5 sonnet

Fixes

- Add pagination in delete mr script
- Bypass llm judge on failed answers
- Disable auto-scaling for VR daily run

Internal

- Deprecate Claude 3 Sonnet comparison in RCA
- Reduce RCA daily run batch size
- Refactor: delete mr script
- Update dependency aiohttp to v3.10.10
- Update dependency black to v24.10.0
- Update dependency evilmartians/lefthook to v1.7.22
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.41.1
- Update dependency google-cloud-aiplatform to v1.70.0
- Update dependency langchain to v0.3.4
- Update dependency poetry to v1.8.4
- Update dependency pydantic-xml to >=2.13.1,<2.14.0
- Update dependency python-gitlab to v4.13.0
- Update dependency tokenizers to v0.20.1

v0.16.0 · 8a2d8883

Features

- feat(duo-chat): new judge for documentation dataset
- feat(vulnerability-resolution): added functional tests to detect if any new vulnerabilities are introduced

Refactoring

- refactor: encapsulate judge for duo-chat

Documentation

- doc: document application limits and service accounts

v0.14.2 · 44f9bae6

Version 0.14.2

Internal

- Update apache/beam_python3.12_sdk Docker tag to v2.59.0
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.36.0
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.37.0
- Update dependency google-cloud-aiplatform to v1.66.0
- Update dependency langchain to ^0.3.0
- Update dependency pytest to v8.3.3
- Update docker Docker tag to v27.2.1

v0.14.1 · 232a6a53

Version 0.14.1

Fixes

- Fix lint command for git push hook

Internal

- Update dependency evilmartians/lefthook to v1.7.15
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.35.1
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.35.2
- Update dependency google-cloud-aiplatform to v1.65.0
- Update dependency langchain to v0.2.16
- Update dependency pydantic to v2.9.1
- Update dependency python to v3.12.6

v0.14.0 · 3e9c88da

Version 0.14.0

Features

- Add support for custom query
- Added the mock scanner evaluation for ETV

Fixes

- Fix async call
- Fix asyncio fixture deprecation warnings
- Use vulnerability description in the LLM judge prompt

Internal

- Add more info into the TestCase object
- Lock file maintenance
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.34.0
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.34.1
- Update docker Docker tag to v27.2.0
- VR judge improvement

v0.12.3 · 3533eb5b

Version 0.12.3

Features

- Add dry-run option to all use cases

Fixes

- Fix csv name
- Fix output table name
- Increase GraphQL retry wait time
- Move GITLAB_TOKEN fetching out of the pipeline
- Use the full vulnerability response object as the model response

Internal

- Lock file maintenance
- Fix failed vale dependency name exclusion
- Update dependency aiohttp to v3.10.5
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.27.1
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.28.0
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.29.0
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.30.0
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.30.1
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.31.0
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.32.0
- Update dependency mypy to v1.11.2
- Update dependency pydantic-xml to >=2.12.1,<2.13.0
- Update dependency pytest-asyncio to ^0.24.0
- Update dependency typer to v0.12.5

v0.11.1 · bef0f733

Version 0.11.1

Internal

- Enhance the Workflow data scraping script
- Extract write_results to common
- Rename rca to root_cause_analysis
- Update dependency aiohttp to v3.10.4
- Update dependency apache-beam to v2.58.1
- Update dependency db-dtypes to v1.3.0
- Update dependency evilmartians/lefthook to v1.7.13
- Update dependency evilmartians/lefthook to v1.7.14
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.27.0
- Update dependency google-cloud-aiplatform to v1.62.0
- Update dependency langchain to v0.2.14
- Update dependency typer to v0.12.4
- Update docker Docker tag to v27.1.2

v0.11.0 · 24919521

Version 0.11.0

Features

- Add Claude 3.5 Opus research model to the model map
- Add RCA daily run staging partitions
- Code review pipelines
- Make chunking methods configurable

Internal

- Lock file maintenance
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.25.3
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.26.0

v0.10.8 · bd3ab84e

Version 0.10.8

Fixes

- Do not use LLM judge on failed answers

Internal

- Update dependency google-cloud-aiplatform to v1.61.0
- Update dependency aiohttp to v3.10.3
- Update dependency python to v3.12.5
- Update dependency tokenizers to ^0.20.0
- Update dependency flake8 to v7.1.1
- Update dependency evilmartians/lefthook to v1.7.12

v0.10.7 · fdcb46d7

Version 0.10.7

Fixes

- Remove precomputed answers in issue_epic daily run
- Tweak the judge's prompt to better format response

Internal

- Log all GraphQL requests
- Update apache/beam_python3.12_sdk Docker tag to v2.58.0
- Cleanup for Duo Chat staging CEF
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.25.2
- Update Duo Chat eval to run on staging
- Update dependency gitlab-com/gl-infra/common-ci-tasks to v2.25.1