fix: evaluate main patch and test patch separately (0 of 2 checklist items completed)
- Merged
- 9
- Approved
Run SWE benchmark as part of ELI5 (0 of 2 checklist items completed)
- Merged
- 21
- 1
- Approved
Add missing SWE Bench datasets to LangSmith (0 of 2 checklist items completed)
- Merged
- 3
- Approved
fix: restore __init__.py in the duochat package (0 of 2 checklist items completed)
- Merged
- 1
- 1
- Approved
chore: update documentation local duo chat evaluation (0 of 2 checklist items completed)
- Merged
- 6
- Approved
feat: json output format when --no-upload is selected (0 of 2 checklist items completed)
- Merged
- 36
- Approved
Adding PUSH from ELI5 Datasets to LangSmith in the Form of CREATE with Splits (1 of 2 checklist items completed)
- Merged
- 19
- Approved
Use group handler to manage maintainers (0 of 2 checklist items completed)
- Merged
- 6
- 1
- Approved
ci: schedule the langsmith:pull CI job (0 of 2 checklist items completed)
- Merged
- 2
- Approved
Pull examples from LangSmith (0 of 2 checklist items completed)
- Merged
- 4
- 1
- Approved
Infer dataset schema and schema mismatch when pulling all datasets (0 of 2 checklist items completed)
- Merged
- 4
- Approved
Pull LangSmith datasets as part of the CI job (0 of 2 checklist items completed)
- Merged
- Approved
feat: Move Dataset Repository into ELI5 (2 of 2 checklist items completed)
- Merged
- 5
- Approved
fix: update google-cloud-aiplatform to get rid of a verbose warning (0 of 2 checklist items completed)
- Merged
- 3
- Approved
feat: make pairwise evaluation feature-specific with pre-defined evaluators (0 of 2 checklist items completed)
- Merged
- 25
- Approved
Evaluate Duo Chat on a document-related QA dataset (0 of 2 checklist items completed)
- Merged
- 25
- Approved
fix: install command in MR template (0 of 2 checklist items completed)
- Merged
- 1
- 1
- Approved
Update google-crc32c to 1.6.0 (0 of 2 checklist items completed)
- Merged
- 2
- Approved
feat: wrap embedding semantic evaluators into generic classes (1 of 2 checklist items completed)
- Merged
- 15
- Approved
Run lint job on code changes on merge request (0 of 2 checklist items completed)
- Merged
- 2
- Approved