Improve VertexAI embeddings creation performance
What does this MR do and why?
This MR changes the flow of the embeddings creation as follows:
- An md5sum of the file content is computed and stored as metadata on the DB embedding record.
- CreateEmptyEmbeddingsRecordsWorker now checks whether the md5sum of a file's contents has changed before re-creating the embeddings for that file.
- CreateEmptyEmbeddingsRecordsWorker now also checks whether the file content is embeddable. I've added this because there are over a hundred files whose content is just `This document was moved to [another location]`, so computing embeddings for these files makes no sense, and skipping them saves us quite a few API calls as well.
- If the file contents are different, CreateEmptyEmbeddingsRecordsWorker enqueues a CreateDbEmbeddingsPerDocFileWorker per file and new version, which allows job deduplication to take place for CreateDbEmbeddingsPerDocFileWorker in case CreateEmptyEmbeddingsRecordsWorker failed and is being rerun.
- On re-run, CreateDbEmbeddingsPerDocFileWorker checks for any existing embeddings for the given file at the given version and cleans those up before creating new embedding records, which prevents duplicates.
- SetEmbeddingsOnTheRecordWorker: besides updating the actual embeddings, this will also clean up old embeddings and replace them with new ones.
- We no longer keep multiple versions of the embeddings except for the short period while embeddings for a single file are built.
- We also no longer bump the current version. This, however, means we'll need to move away from versioning the embeddings to marking/flagging them as `current` instead.
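The change-detection and embeddability checks described above could look roughly like the sketch below. This is only an illustration and not the code in this MR: `content_md5`, `embeddable?`, `files_to_embed`, the `MOVED_NOTICE` pattern and the plain hashes standing in for doc files and DB metadata are all hypothetical names chosen for the example.

```ruby
require 'digest'

# Assumption: placeholder pages start with this notice and carry no other
# useful content, so they are not worth an embeddings API call.
MOVED_NOTICE = /\AThis document was moved to \[another location\]/

# md5sum of the file content, as it would be stored in the record metadata.
def content_md5(content)
  Digest::MD5.hexdigest(content)
end

# A file is embeddable when it is non-empty and not just a "moved" notice.
def embeddable?(content)
  stripped = content.strip
  !stripped.empty? && !stripped.match?(MOVED_NOTICE)
end

# files:       { path => current file content }
# stored_md5s: { path => md5sum saved with the existing embeddings }
# Returns the paths whose embeddings would need to be (re)created.
def files_to_embed(files, stored_md5s)
  changed = files.select do |path, content|
    embeddable?(content) && stored_md5s[path] != content_md5(content)
  end
  changed.keys
end

files = {
  'doc/a.md' => "# A\nUnchanged text",
  'doc/b.md' => "# B\nNew text",
  'doc/c.md' => 'This document was moved to [another location](../d.md).'
}
stored_md5s = {
  'doc/a.md' => content_md5("# A\nUnchanged text"),
  'doc/b.md' => content_md5("# B\nOld text")
}

p files_to_embed(files, stored_md5s) # prints ["doc/b.md"]
```

In this sketch only `doc/b.md` would get a per-file job enqueued: `doc/a.md` is unchanged and `doc/c.md` is not embeddable.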
Related issues
- https://gitlab.com/gitlab-org/gitlab/-/issues/424010+
- https://gitlab.com/gitlab-org/gitlab/-/issues/424016+
Screenshots or screen recordings
- To test this out, you need to enable the following FFs, which are checked in the workers:
    return unless Feature.enabled?(:openai_experimentation) # this is legacy global AI toggle FF
    return unless Feature.enabled?(:gitlab_duo) # chat specific FF
    return unless Feature.enabled?(:create_embeddings_with_vertex_ai) # file_embeddings supported by vertex FF
    return unless ::License.feature_available?(:ai_chat) # license check
- Make sure the VertexAI API key is configured.
- Then, in the rails console, you can just run `::Llm::Embedding::GitlabDocumentation::CreateEmptyEmbeddingsRecordsWorker.new.perform`. Note that it will make actual calls to the VertexAI embeddings API. It runs for ~30-40 minutes locally in my testing.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Edited by Alexandru Croitor