Remove embeddings tests in monolith
Currently, we have some RSpec tests in the GitLab monolith that test embeddings-related questions using the :ai_embedding_fixtures RSpec metadata tag.
Two recent changes affect embeddings/documentation questions:
- Our Chat evaluation framework includes documentation questions, so we can test documentation questions locally.
- We've moved to serving embeddings from the AI Gateway instead of the monolith, so the embeddings database, while still part of the monolith, is no longer used.
Our current test strategy is to first seed the local GitLab embeddings database with documentation embeddings and then extract the relevant embeddings from that database into a fixture file.
Going forward, developers won't need to set up an embeddings database as part of the local gitlab instance because that is handled by the AI Gateway.
This means that updating the fixtures will become a pain because the whole embeddings DB will need to be set up locally just to do the extraction for tests. Also, our embeddings setup documentation will likely become out of date over time if we are not regularly using it.
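For anyone unfamiliar with the current flow, here is a minimal sketch of the seed-then-extract idea described above. It is not the actual rake task code: the row fields, the `extract_fixture` and `load_fixture` helpers, and the in-memory array standing in for the seeded database are all hypothetical, chosen only to illustrate why a full database is needed before any fixture can be refreshed.

```ruby
require "json"
require "tmpdir"

# Hypothetical stand-in for rows in a locally seeded embeddings database.
# The field names (:id, :url, :content, :embedding) are assumptions for illustration.
SEEDED_ROWS = [
  { id: 1, url: "doc/user/project/issues.md", content: "How to create an issue", embedding: [0.1, 0.2] },
  { id: 2, url: "doc/ci/pipelines.md", content: "How to run a pipeline", embedding: [0.3, 0.4] },
  { id: 3, url: "doc/user/ssh.md", content: "How to add an SSH key", embedding: [0.5, 0.6] }
].freeze

# Extraction step: keep only the rows the specs need and write them to a fixture file.
# Returns the number of rows written.
def extract_fixture(rows, wanted_ids, path)
  selected = rows.select { |row| wanted_ids.include?(row[:id]) }
  File.write(path, JSON.pretty_generate(selected))
  selected.size
end

# Specs would then read the fixture file instead of querying a full database.
def load_fixture(path)
  JSON.parse(File.read(path), symbolize_names: true)
end

FIXTURE_PATH = File.join(Dir.mktmpdir, "embeddings_fixture.json")
extract_fixture(SEEDED_ROWS, [1, 3], FIXTURE_PATH)
FIXTURE = load_fixture(FIXTURE_PATH)
```

The point of the sketch is the dependency: `extract_fixture` has nothing to select from unless the seeded data exists first, which is the setup cost that developers would otherwise no longer need to pay.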
Now that we have eval abilities for documentation questions and the docs are being served from the AI Gateway, should we just get rid of the embeddings-in-tests functionality? It is only used by chat_real_requests_spec, which is what we use to test tool selection. But we can do tool selection tests using a Python notebook or LangSmith instead.
If we agree on removing these embeddings tests, we should:
- Update documentation to remove [this section](https://docs.gitlab.com/ee/development/ai_features/index.html#use-embeddings-in-specs) (completed in Simplify AI docs)
- Remove the rake tasks for seeding and extracting the embeddings database
- Remove the chat_real_requests questions that use embeddings