
Support language filter for blob searches

What does this MR do and why?

Related to #342648 (closed)

Allows the language param to be passed into the SearchController so that the frontend can use it for blobs-scoped searches. With this change, results in the code/blobs scope can be filtered by language.

Note: This feature is currently behind the search_blobs_language_aggregation feature flag, which is disabled by default.

Screenshots or screen recordings

Example Elasticsearch query generated
{
  "query": {
    "bool": {
      "must": {
        "simple_query_string": {
          "_name": "blob:match:search_terms",
          "fields": [
            "blob.content",
            "blob.file_name",
            "blob.path"
          ],
          "query": "test",
          "default_operator": "and"
        }
      },
      "must_not": [],
      "should": [],
      "filter": [
        {
          "has_parent": {
            "_name": "blob:authorized:project",
            "parent_type": "project",
            "query": {
              "bool": {
                "should": [
                  {
                    "bool": {
                      "filter": [
                        {
                          "term": {
                            "visibility_level": {
                              "_name": "blob:authorized:project:any",
                              "value": 0
                            }
                          }
                        },
                        {
                          "terms": {
                            "_name": "blob:authorized:project:repository:enabled_or_private",
                            "repository_access_level": [
                              20,
                              10
                            ]
                          }
                        }
                      ]
                    }
                  },
                  {
                    "bool": {
                      "_name": "blob:authorized:project:visibility:10:repository:access_level",
                      "filter": [
                        {
                          "term": {
                            "visibility_level": {
                              "_name": "blob:authorized:project:visibility:10",
                              "value": 10
                            }
                          }
                        },
                        {
                          "terms": {
                            "_name": "blob:authorized:project:visibility:10:repository:access_level:enabled_or_private",
                            "repository_access_level": [
                              20,
                              10
                            ]
                          }
                        }
                      ]
                    }
                  },
                  {
                    "bool": {
                      "_name": "blob:authorized:project:visibility:20:repository:access_level",
                      "filter": [
                        {
                          "term": {
                            "visibility_level": {
                              "_name": "blob:authorized:project:visibility:20",
                              "value": 20
                            }
                          }
                        },
                        {
                          "terms": {
                            "_name": "blob:authorized:project:visibility:20:repository:access_level:enabled_or_private",
                            "repository_access_level": [
                              20,
                              10
                            ]
                          }
                        }
                      ]
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "term": {
            "type": {
              "_name": "doc:is_a:blob",
              "value": "blob"
            }
          }
        },
        {
          "terms": {
            "_name": "blob:match:languages",
            "blob.language": [
              "Markdown"
            ]
          }
        }
      ]
    }
  },
  "size": 20,
  "from": 0,
  "sort": [
    "_score"
  ],
  "highlight": {
    "pre_tags": [
      "gitlabelasticsearch→"
    ],
    "post_tags": [
      "←gitlabelasticsearch"
    ],
    "number_of_fragments": 0,
    "fields": {
      "blob.content": {},
      "blob.file_name": {}
    }
  },
  "aggs": {
    "language": {
      "composite": {
        "sources": [
          {
            "language": {
              "terms": {
                "field": "blob.language"
              }
            }
          }
        ]
      }
    }
  }
}
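
The language clause at the end of the filter array above can be sketched in a few lines. This is a minimal Python illustration, not the Ruby code in this MR: the helper names language_filter and apply_language_filter are hypothetical, and the feature flag is modeled as a plain boolean argument.

```python
from typing import Any, Dict, List, Optional


def language_filter(
    languages: Optional[List[str]], flag_enabled: bool
) -> Optional[Dict[str, Any]]:
    """Return a terms clause on blob.language, or None when no filter applies.

    Hypothetical helper: mirrors the shape of the "blob:match:languages"
    clause in the example query, nothing more.
    """
    if not flag_enabled or not languages:
        return None
    return {
        "terms": {
            "_name": "blob:match:languages",
            "blob.language": list(languages),
        }
    }


def apply_language_filter(
    query: Dict[str, Any], languages: Optional[List[str]], flag_enabled: bool
) -> Dict[str, Any]:
    """Append the language clause to query.bool.filter when one applies."""
    clause = language_filter(languages, flag_enabled)
    if clause is not None:
        query["query"]["bool"]["filter"].append(clause)
    return query


# Usage: the clause is added only when the flag is on and a language is given.
base = {"query": {"bool": {"filter": []}}}
filtered = apply_language_filter(base, ["Markdown"], flag_enabled=True)
```

Because the clause sits in the filter context, it narrows the result set without affecting the _score used for sorting.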

How to set up and validate locally

  1. Ensure that GDK is set up for Elasticsearch and that Advanced Search is enabled in the Admin settings
  2. Index all projects before testing (bundle exec rake gitlab:elastic:index)
  3. Make sure the feature flag search_blobs_language_aggregation is enabled locally
  4. Run a search for test in the code tab: http://gdk.test:3000/search?scope=blobs&search=test
  5. Validate that results in multiple languages come back
  6. Run a search for test in the code tab but add a language to the URL: http://gdk.test:3000/search?scope=blobs&search=test&language[]=Markdown
  7. Validate that only results in that language come back
  8. Disable the feature flag search_blobs_language_aggregation
  9. Run the same search again with the language in the URL: http://gdk.test:3000/search?scope=blobs&search=test&language[]=Markdown
  10. Validate that results in multiple languages come back and that the language parameter is ignored
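
The search URLs used in the steps above can be built programmatically, which may help when checking several languages. This is a minimal sketch using only the Python standard library. The function name search_url is hypothetical, and the base URL assumes the default GDK host used in the steps.

```python
from urllib.parse import urlencode

# Assumed local GDK address, taken from the validation steps.
BASE_URL = "http://gdk.test:3000/search"


def search_url(term, languages=()):
    """Build a blobs-scope search URL, adding one language[] param per language."""
    params = [("scope", "blobs"), ("search", term)]
    params += [("language[]", lang) for lang in languages]
    # safe="[]" keeps the brackets literal so the URL matches the steps;
    # Rails would also accept the percent-encoded form.
    return f"{BASE_URL}?{urlencode(params, safe='[]')}"


print(search_url("test"))
print(search_url("test", ["Markdown"]))
```

Passing more than one language repeats the language[] key, which is the usual Rails convention for array parameters.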

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Terri Chu
