suggest searches against completion fields with a synonym set result in null pointer exceptions #114651

Open
canada-j opened this issue Oct 11, 2024 · 2 comments
Labels: >bug, priority:normal, :Search Relevance/Suggesters, Team:Search Relevance

Comments

@canada-j

Elasticsearch Version

8.13.2

Installed Plugins

No response

Java Version

bundled

OS Version

Elastic Cloud

Problem Description

When performing a suggest search against a completion field whose search_analyzer includes a synonym filter backed by a synonyms_set, Elasticsearch returns a 500 response with a null pointer exception.

This behavior does not occur when using inline synonyms or synonyms_path, and the synonym set behaves as expected when used through the _analyze endpoint.

Expected: suggest searches on completion fields that utilize a synonym set return a non-error response, and the suggest prefix is correctly tokenized according to the rules of the synonym set.

Actual: a 500 response is returned, with a null_pointer_exception and a reason of Cannot invoke "org.elasticsearch.index.analysis.AnalyzerComponents.getCharFilters()" because "components" is null

Steps to Reproduce

Create a synonym set containing any number of synonym rules (zero or more).

Create an index with the following details:

  • A filter containing:
    • A type of synonym or synonym_graph
    • "updateable": "true"
    • A synonyms_set using the name of the previously created synonym set
  • An analyzer containing:
    • A filter list that includes the previously created filter
    • "tokenizer": "standard"
  • A completion field containing:
    • A search_analyzer using the name of the previously created analyzer

Create a document in this index with an arbitrary value for the previously created completion field.

Send a request to the index's _search endpoint using a suggest query with an arbitrary prefix.

The following requests should reproduce the issue:

# Create synonym set
PUT _synonyms/test_synonyms
{
  "synonyms_set": [
    {
      "id": "test-1",
      "synonyms": "jerry, gary"
    }
  ]
}

# (Re)create index
DELETE test_synonym_index
PUT test_synonym_index
{
  "settings": {
    "analysis": {
      "filter": {
        "filter-synonym-completion": {
          "type": "synonym_graph",
          "updateable": "true",
          "synonyms_set": "test_synonyms"
        }
      },
      "analyzer": {
        "analyzer-completion-search": {
          "filter": [
            "lowercase",
            "filter-synonym-completion"
          ],
          "tokenizer": "standard"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "titleCompletion": {
        "type": "completion",
        "search_analyzer": "analyzer-completion-search"
      }
    }
  }
}

# Create a document
PUT test_synonym_index/_doc/1?refresh
{
  "titleCompletion" : {
    "input": "Gary is the title of this document"
  }
}

# Attempt a suggestion search
POST test_synonym_index/_search?filter_path=suggest
{
  "suggest": {
    "title": {
      "prefix": "jerry",
      "completion": {
        "field": "titleCompletion"
      }
    }
  }
}

Note that if the synonym filter is instead defined at index creation with inline synonyms or synonyms_path, and updateable is omitted, the suggest search works as expected:

# (Re)create index
DELETE test_synonym_index
PUT test_synonym_index
{
  "settings": {
    "analysis": {
      "filter": {
        "filter-synonym-completion": {
          "type": "synonym_graph",
          "synonyms": ["jerry, gary"]
        }
      },
      "analyzer": {
        "analyzer-completion-search": {
          "filter": [
            "lowercase",
            "filter-synonym-completion"
          ],
          "tokenizer": "standard"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "titleCompletion": {
        "type": "completion",
        "search_analyzer": "analyzer-completion-search"
      }
    }
  }
}
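
For anyone checking whether an existing index matches the failing configuration, the contrast between the two index definitions above can be turned into a quick check. This is a minimal Python sketch, not part of the original report. The function name find_affected_completion_fields is hypothetical. It assumes the index body has the same shape as the PUT requests above (analysis settings directly under settings, top-level fields only), and it treats any synonym filter that uses synonyms_set or sets updateable as a heuristic match for the failing case.

```python
from typing import Any, Dict, List


def _is_truthy(value: Any) -> bool:
    """Accept both JSON booleans and the string form "true"."""
    return str(value).lower() == "true"


def find_affected_completion_fields(index_body: Dict[str, Any]) -> List[str]:
    """Return completion fields whose search_analyzer uses a reloadable synonym filter.

    Heuristic only: it mirrors the failing configuration from this report and
    does not inspect nested fields or settings nested under an "index" key.
    """
    analysis = index_body.get("settings", {}).get("analysis", {})
    filters = analysis.get("filter", {})
    analyzers = analysis.get("analyzer", {})

    # Synonym filters backed by a synonym set or explicitly marked updateable.
    reloadable = {
        name
        for name, spec in filters.items()
        if spec.get("type") in ("synonym", "synonym_graph")
        and ("synonyms_set" in spec or _is_truthy(spec.get("updateable", False)))
    }

    affected = []
    properties = index_body.get("mappings", {}).get("properties", {})
    for field, mapping in properties.items():
        if mapping.get("type") != "completion":
            continue
        analyzer = analyzers.get(mapping.get("search_analyzer"), {})
        if reloadable.intersection(analyzer.get("filter", [])):
            affected.append(field)
    return affected
```

Passing the first index body from this report should flag titleCompletion, while the inline-synonyms variant should return an empty list.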

Logs (if relevant)

The exact error response from Elasticsearch is as follows:

{
  "error": {
    "root_cause": [
      {
        "type": "null_pointer_exception",
        "reason": """Cannot invoke "org.elasticsearch.index.analysis.AnalyzerComponents.getCharFilters()" because "components" is null"""
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "test_synonym_index",
        "node": "RrPyixOFR-u3vsvSanASqg",
        "reason": {
          "type": "null_pointer_exception",
          "reason": """Cannot invoke "org.elasticsearch.index.analysis.AnalyzerComponents.getCharFilters()" because "components" is null"""
        }
      }
    ],
    "caused_by": {
      "type": "null_pointer_exception",
      "reason": """Cannot invoke "org.elasticsearch.index.analysis.AnalyzerComponents.getCharFilters()" because "components" is null""",
      "caused_by": {
        "type": "null_pointer_exception",
        "reason": """Cannot invoke "org.elasticsearch.index.analysis.AnalyzerComponents.getCharFilters()" because "components" is null"""
      }
    }
  },
  "status": 500
}
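
To recognize this specific failure in client code, the root_cause entries of the response above can be inspected. The following is a small Python sketch under stated assumptions: the response body has already been parsed into a dict, and the helper name is_components_npe and the marker string are illustrative choices, not part of any Elasticsearch client API.

```python
from typing import Any, Dict

# Substring taken from the "reason" field in the error response above.
NPE_MARKER = "AnalyzerComponents.getCharFilters()"


def is_components_npe(response: Dict[str, Any]) -> bool:
    """Return True if a parsed error response carries the NPE from this report."""
    error = response.get("error", {})
    if not isinstance(error, dict):
        return False
    for cause in error.get("root_cause", []):
        if (
            cause.get("type") == "null_pointer_exception"
            and NPE_MARKER in cause.get("reason", "")
        ):
            return True
    return False
```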

@canada-j added the >bug and needs:triage labels on Oct 11, 2024
@gwbrown added the :Search Relevance/Suggesters label and removed the needs:triage label on Oct 15, 2024
@elasticsearchmachine added the Team:Search Relevance label on Oct 15, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@benwtrent added the priority:normal label on Dec 5, 2024
@mayya-sharipova self-assigned this on Jan 10, 2025
@mayya-sharipova
Contributor

This happens because CompletionAnalyzer, which wraps the reloadable synonym analyzer, uses a default reuse strategy that does not consult the wrapped analyzer's own strategy to check whether the analyzer components need to be updated. The fix requires the following:

  • In Lucene: add a new wrapping reuse strategy, and add an optional constructor to CompletionAnalyzer that uses it. I've created a PR for this.
  • In Elasticsearch: create CompletionAnalyzer with this new strategy.
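
The mechanism described above can be illustrated with a toy model. This is a Python sketch, not Lucene or Elasticsearch code: all class and method names are hypothetical, and it assumes that the inner analyzer's reuse strategy is what populates the components it later reads, which is one reading of the "components is null" error. A wrapper that keeps its own cache never triggers that step, while a wrapper that delegates to the inner strategy does.

```python
class ReloadableAnalyzer:
    """Toy stand-in for an analyzer whose components can be swapped at runtime."""

    def __init__(self, filters):
        self.components = {"filters": list(filters)}  # latest components
        self.stored = None   # copy actually used when building a stream
        self.cache = None    # previously built stream, if any

    def reload(self, filters):
        # Simulates a synonym reload: new components replace the old ones.
        self.components = {"filters": list(filters)}

    def get_reusable(self):
        # Reuse strategy: refresh `stored` when components changed and
        # return None to force a rebuild; otherwise return the cache.
        if self.stored is not self.components:
            self.stored = self.components
            return None
        return self.cache

    def set_reusable(self, stream):
        self.cache = stream

    def create(self):
        # Relies on get_reusable() having populated `stored` beforehand.
        return "+".join(self.stored["filters"])


class WrappingAnalyzer:
    """Toy wrapper; `delegate` selects which reuse strategy is consulted."""

    def __init__(self, inner, delegate):
        self.inner = inner
        self.delegate = delegate
        self.own_cache = None

    def token_stream(self):
        cached = self.inner.get_reusable() if self.delegate else self.own_cache
        if cached is not None:
            return cached
        stream = "completion(" + self.inner.create() + ")"
        if self.delegate:
            self.inner.set_reusable(stream)
        else:
            self.own_cache = stream
        return stream
```

In this model, WrappingAnalyzer(inner, delegate=False).token_stream() fails with a TypeError on the first call because stored is still None (the Python analogue of the null pointer exception), whereas delegate=True builds the stream and rebuilds it after reload().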
