
Change the TanukiBot's distance function

What does this MR do and why?

This is the first MR for https://gitlab.com/gitlab-org/gitlab/-/issues/410581.

It changes TanukiBot's distance function from inner_product to cosine, as recommended by the OpenAI documentation.

Follow-up MR: Add index to embeddings (!122035 - closed)

Screenshots or screen recordings

After running the following commands, we got these results:

current_user = User.first
client = ::Gitlab::Llm::OpenAi::Client.new(current_user)
question = 'What is Fork?'
embeddings_result = client.embeddings(input: question)
question_embedding = embeddings_result['data'].first['embedding']
Embedding::TanukiBotMvc.neighbor_for(question_embedding, limit: 7).pluck(:id)
| Before | After |
| ------ | ----- |
| Using inner_product as distance function | Using cosine as distance function |
| [6909, 12665, 6913, 6910, 7125, 6912, 6825] | [6909, 12665, 6913, 6910, 7125, 6912, 6825] |
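The identical ID lists are expected rather than suspicious: OpenAI documents that its embeddings are normalized to length 1, and for unit vectors cosine distance is just 1 minus the inner product, so both functions order neighbors the same way. The following is a minimal, self-contained sketch of that property in plain Ruby. All helper names and the toy three-dimensional vectors are hypothetical and chosen for illustration. It does not use the project's `Embedding::TanukiBotMvc` model or the database.

```ruby
# Hypothetical helpers for illustration only; no database or gem involved.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

def norm(a)
  Math.sqrt(dot(a, a))
end

# Scale a vector to length 1, as OpenAI embeddings already are.
def normalize(a)
  n = norm(a)
  a.map { |x| x / n }
end

def cosine_distance(a, b)
  1.0 - dot(a, b) / (norm(a) * norm(b))
end

# Nearest first: the largest inner product ranks highest.
def rank_by_inner_product(query, docs)
  docs.sort_by { |_id, vec| -dot(query, vec) }.map(&:first)
end

# Nearest first: the smallest cosine distance ranks highest.
def rank_by_cosine(query, docs)
  docs.sort_by { |_id, vec| cosine_distance(query, vec) }.map(&:first)
end

# Toy data (assumed values, not real embeddings).
raw_query = [0.2, 0.9, 0.1]
raw_docs = {
  1 => [0.1, 1.0, 0.0],
  2 => [10.0, 1.0, 3.0], # long vector pointing away from the query
  3 => [0.3, 0.7, 0.4]
}

unit_query = normalize(raw_query)
unit_docs = raw_docs.transform_values { |vec| normalize(vec) }

# With unit-length vectors both rankings agree.
unit_ip_ranking = rank_by_inner_product(unit_query, unit_docs)
unit_cos_ranking = rank_by_cosine(unit_query, unit_docs)

# Without normalization, the long vector wins on inner product alone.
raw_ip_ranking = rank_by_inner_product(raw_query, raw_docs)
raw_cos_ranking = rank_by_cosine(raw_query, raw_docs)

puts "unit vectors: inner_product=#{unit_ip_ranking.inspect} cosine=#{unit_cos_ranking.inspect}"
puts "raw vectors:  inner_product=#{raw_ip_ranking.inspect} cosine=#{raw_cos_ranking.inspect}"
```

In this sketch the two rankings only diverge for the unnormalized vectors, which is why switching the distance function is not expected to change results for OpenAI embeddings.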

How to set up and validate locally

N/A

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Bojan Marjanovic
