Set FromHash from zoektSHA for indexing

Terri Chu requested to merge tchu-fix-set-fromhash into main

Background

Related to #5 (closed)

While working on another issue, I found that FromHash was not being set, so every indexing run performed a full index of the repository.

What this MR does

  • during indexing, read the recorded zoekt SHA from the repository metadata
  • if the zoekt SHA is not found, set FromHash to "" to trigger a full repository index
  • if the zoekt SHA is found, ask Gitaly whether it still exists; if it does, set FromHash to the SHA, otherwise set FromHash to "" (a missing SHA likely means a force push occurred)
  • add a custom error type
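
The decision described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: `resolveFromHash`, `commitChecker`, `CommitExists`, and `fakeGitaly` are hypothetical names, and the real Gitaly client call may look different.

```go
package main

import "fmt"

// commitChecker is a hypothetical stand-in for the Gitaly lookup;
// the real client's method name and signature are assumptions.
type commitChecker interface {
	CommitExists(sha string) (bool, error)
}

// resolveFromHash returns the SHA to index from, or "" to request a full index.
func resolveFromHash(recordedSHA string, gitaly commitChecker) (string, error) {
	// No SHA recorded in the repository metadata: do a full index.
	if recordedSHA == "" {
		return "", nil
	}

	exists, err := gitaly.CommitExists(recordedSHA)
	if err != nil {
		return "", fmt.Errorf("checking commit %s: %w", recordedSHA, err)
	}

	// The recorded SHA is gone (likely a force push): fall back to a full index.
	if !exists {
		return "", nil
	}

	return recordedSHA, nil
}

// fakeGitaly is an in-memory checker used only for this illustration.
type fakeGitaly map[string]bool

func (f fakeGitaly) CommitExists(sha string) (bool, error) {
	return f[sha], nil
}

func main() {
	gitaly := fakeGitaly{"abc123": true}

	for _, sha := range []string{"", "abc123", "deadbeef"} {
		from, err := resolveFromHash(sha, gitaly)
		fmt.Printf("recorded=%q FromHash=%q err=%v\n", sha, from, err)
	}
}
```

Only a recorded SHA that Gitaly still knows about yields an incremental index; every other case falls back to "".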

How to test

I manually tested this in GDK using the flightJS project (project ID 7).

Apply this diff to print FromHash and ToHash during indexing:
diff --git a/internal/indexer/indexer.go b/internal/indexer/indexer.go
index 3836fe5..08a5784 100644
--- a/internal/indexer/indexer.go
+++ b/internal/indexer/indexer.go
@@ -3,6 +3,7 @@ package indexer
 import (
        "context"
        "errors"
+       "fmt"
 
        custom_error "gitlab.com/gitlab-org/gitlab-zoekt-indexer/internal"
        "gitlab.com/gitlab-org/gitlab-zoekt-indexer/internal/gitaly"
@@ -159,6 +160,8 @@ func (i *Indexer) indexRepository() error {
                i.gitalyClient.FromHash = ""
        }
 
+       fmt.Printf("FromHash %v\nToHash %v\n", i.gitalyClient.FromHash, i.gitalyClient.ToHash)
+
        err = i.gitalyClient.EachFileChange(putFunc, delFunc)
 
        if err != nil {

GDK

  1. stop zoekt indexer on gdk: gdk stop zoekt-dynamic-indexserver-development
  2. find the project directory from rails console: "#{Project.find(7).repository.disk_path}.git"

zoekt indexer

  1. run the server: make watch-run listen=:6061 index_dir=<REPLACE_WITH_GDK_DIR>/zoekt-data/development/index
  2. cleanup any existing indexed data from zoekt: curl -XPOST -H 'Content-Type: application/json' http://127.0.0.1:6061/indexer/truncate
  3. index the project
    ➜ curl -XPOST -d '{"GitalyConnectionInfo": {"Address": "unix:/<REPLACE_WITH_GDK_DIR>/praefect.socket", "Storage": "default", "Path": "@hashed/79/02/7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451.git"}, "RepoId":7, "FileSizeLimit": 2097152, "Timeout": "1h"}' -H 'Content-Type: application/json' http://127.0.0.1:6061/indexer/index
  4. note that the printed FromHash is an empty string and ToHash is populated
  5. push a change to the project (either in the UI or with git)
  6. index the project again (repeat step 3)
  7. note that both FromHash and ToHash are now populated
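
If you would rather script the index call than retype the curl command, the request could be built roughly like this in Go. It is only a sketch: the struct types and `newIndexRequest` are hypothetical and simply mirror the JSON keys in the curl payload above, not the indexer's actual Go types. The `<REPLACE_WITH_GDK_DIR>` placeholder still needs to be filled in.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// These types are assumptions that mirror the JSON keys used in the curl command.
type gitalyConnectionInfo struct {
	Address string `json:"Address"`
	Storage string `json:"Storage"`
	Path    string `json:"Path"`
}

type indexRequest struct {
	GitalyConnectionInfo gitalyConnectionInfo `json:"GitalyConnectionInfo"`
	RepoID               uint32               `json:"RepoId"`
	FileSizeLimit        int                  `json:"FileSizeLimit"`
	Timeout              string               `json:"Timeout"`
}

// newIndexRequest builds the POST request for the /indexer/index endpoint.
func newIndexRequest(baseURL string, payload indexRequest) (*http.Request, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, baseURL+"/indexer/index", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")

	return req, nil
}

func main() {
	payload := indexRequest{
		GitalyConnectionInfo: gitalyConnectionInfo{
			Address: "unix:/<REPLACE_WITH_GDK_DIR>/praefect.socket",
			Storage: "default",
			Path:    "@hashed/79/02/7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451.git",
		},
		RepoID:        7,
		FileSizeLimit: 2097152,
		Timeout:       "1h",
	}

	req, err := newIndexRequest("http://127.0.0.1:6061", payload)
	if err != nil {
		fmt.Println("building request:", err)
		return
	}

	// Sending is left out so the sketch runs without a local indexer;
	// http.DefaultClient.Do(req) would perform the call.
	fmt.Println(req.Method, req.URL)
}
```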
Edited by Terri Chu
