Pagination added to the following LangSmith dataset (0 of 2 checklist items completed)
Add missing SWE Bench datasets to LangSmith (0 of 2 checklist items completed)
- Merged
- 3
- Approved
fix: restore __init__.py in the duochat package (0 of 2 checklist items completed)
- Merged
- 1
- 1
- Approved
Run SWE benchmark as part of ELI5 (0 of 2 checklist items completed)
- Merged
- 21
- 1
- Approved
feat: add direct routes and server side metrics for Fireworks provider (0 of 3 checklist items completed)
ci: schedule the langsmith:pull CI job (0 of 2 checklist items completed)
- Merged
- 2
- Approved
Pull LangSmith datasets as part of the CI job (0 of 2 checklist items completed)
- Merged
- Approved
Adding PUSH from ELI5 Datasets to LangSmith in the Form of CREATE with Splits (1 of 2 checklist items completed)
- Merged
- 19
- Approved
chore: update documentation local duo chat evaluation (0 of 2 checklist items completed)
- Merged
- 6
- Approved
Infer dataset schema and schema mismatch when pulling all datasets (0 of 2 checklist items completed)
- Merged
- 4
- Approved
Pull examples from LangSmith (0 of 2 checklist items completed)
- Merged
- 4
- 1
- Approved
feat: make pairwise evaluation feature-specific with pre-defined evaluators (0 of 2 checklist items completed)
- Merged
- 25
- Approved
feat: wrap embedding semantic evaluators into generic classes (1 of 2 checklist items completed)
- Merged
- 15
- Approved
feat: introduce base evaluators to build custom ones (1 of 2 checklist items completed)
- Merged
- 3
- Approved
Update the fix-broken-pipeline evaluation for Duo Workflow (1 of 2 checklist items completed)
- Merged
- 23
- Approved
Evaluate Duo Chat on an issue/epic-related QA dataset (0 of 2 checklist items completed)
- Merged
- 15
- Approved
Add duo chat dataset with issue and epic resources (0 of 2 checklist items completed)
- Merged
- 2
- Approved
Use DatasetRegistry to register predefined dataset schemas (0 of 2 checklist items completed)
- Merged
- 1
- Approved
Evaluate Duo Chat on a document-related QA dataset (0 of 2 checklist items completed)
- Merged
- 25
- Approved
Add pairwise evaluator (1 of 2 checklist items completed)
- Merged
- 3
- Approved