Feature request: search_entity_labels — fuzzy entity search with degree #5

@magnus919

Summary

Add a search_entity_labels tool that accepts one or more query strings and returns matching entity names from the graph along with their degree (relation count). This is a lightweight alternative to loading the full label list.

Proposed Interface

search_entity_labels(
    queries: list[str],   # one or more search terms
    limit: int = 10,      # max results per query
) -> list[{
    query: str,
    matches: list[{ entity: str, degree: int }]
}]
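One way to pin down the return shape is with typed dictionaries. The sketch below is illustrative only: the class names `EntityMatch` and `QueryResult` are hypothetical, and the degree values in the sample response are made up.

```python
from typing import TypedDict


class EntityMatch(TypedDict):
    entity: str  # entity name exactly as stored in the graph
    degree: int  # number of relations attached to that entity


class QueryResult(TypedDict):
    query: str                  # the search term this result belongs to
    matches: list[EntityMatch]  # at most `limit` entries


# Hypothetical response for queries=["Brynjolfsson", "nonexistent"], limit=10.
# The degree values are invented for illustration.
example_response: list[QueryResult] = [
    {
        "query": "Brynjolfsson",
        "matches": [
            {"entity": "Erik Brynjolfsson", "degree": 42},
            {"entity": "Brynjolfsson", "degree": 3},
        ],
    },
    {"query": "nonexistent", "matches": []},
]
```

Returning one result object per query, including queries with no matches, lets the caller line results up with inputs without extra bookkeeping.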

Motivation

The full entity label list is now 30,000+ entries and too large to load inline in agent contexts. Agents need a cheap way to ask "does LightRAG have a node for X?" before deciding whether to query, insert, or both.

Fuzzy matching matters because entity extraction produces variant forms — e.g. "Brynjolfsson" vs "Erik Brynjolfsson" vs "Erik J. Brynjolfsson". Exact-match lookups will miss these variants.

Degree matters because it signals whether a match is a well-connected hub or a leaf node. A hub entity is a much better query anchor than a low-degree one, and agents can use this to pick the best match when multiple variants exist.
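As a rough illustration of how an agent might use degree on its side, the sketch below picks the highest-degree variant as the query anchor. The helper name `pick_anchor` and the sample degrees are assumptions, not part of the proposed tool.

```python
from typing import Optional


def pick_anchor(matches: list[dict]) -> Optional[str]:
    """Return the entity with the highest degree, or None if there are no matches."""
    if not matches:
        return None
    # max() keeps the first entry on ties, so input order acts as the tie-breaker.
    return max(matches, key=lambda m: m["degree"])["entity"]


# Made-up degrees for the variant forms mentioned above.
variants = [
    {"entity": "Brynjolfsson", "degree": 3},
    {"entity": "Erik Brynjolfsson", "degree": 42},
    {"entity": "Erik J. Brynjolfsson", "degree": 1},
]
anchor = pick_anchor(variants)
```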

Batch input (list of queries) matters for post-insert verification: after indexing a document, an agent can check 3–5 key entities in a single call rather than making N round trips.

Use Cases

  • Pre-query: confirm whether a concept is richly represented before choosing query mode (local vs. global vs. hybrid)
  • Post-insert: verify that key entities from a newly indexed document were extracted into the graph
  • Entity disambiguation: find what the graph actually calls something before forming a traversal query

Notes

  • Fuzzy matching could be implemented via substring search, case-insensitive prefix match, or a lightweight edit-distance approach. Even simple substring matching would cover most cases.
  • When Neo4j is the graph backend, degree can be retrieved with a straightforward size((n)--()) query on Neo4j 4.x. Neo4j 5 no longer accepts a pattern inside size(), so the equivalent there is COUNT { (n)--() }.
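The matching described in these notes could look roughly like the sketch below, which is not tied to any LightRAG internals. It assumes the labels and their degrees are already available as an in-memory dict (the `degrees` parameter, which is hypothetical and not part of the proposed signature), and it combines case-insensitive substring matching with a similarity-ratio fallback from the standard library's `difflib` standing in for edit distance. The `min_ratio` threshold is likewise an assumption.

```python
from difflib import SequenceMatcher


def search_entity_labels(queries, limit=10, *, degrees, min_ratio=0.8):
    """Fuzzy-match each query against entity labels.

    `degrees` maps entity label -> relation count (hypothetical input;
    a real backend would supply this from the graph store).
    """
    results = []
    for query in queries:
        q = query.casefold()
        matches = []
        if q:  # an empty query would otherwise match every label
            for entity, degree in degrees.items():
                name = entity.casefold()
                is_substring = q in name
                is_similar = SequenceMatcher(None, q, name).ratio() >= min_ratio
                if is_substring or is_similar:
                    matches.append({"entity": entity, "degree": degree})
        # Highest degree first so hub entities surface at the top; name breaks ties.
        matches.sort(key=lambda m: (-m["degree"], m["entity"]))
        results.append({"query": query, "matches": matches[:limit]})
    return results


# Made-up sample graph: label -> degree.
sample_degrees = {
    "Erik Brynjolfsson": 42,
    "Brynjolfsson": 3,
    "Erik J. Brynjolfsson": 1,
    "Andrew McAfee": 17,
}
found = search_entity_labels(["brynjolfsson", "Andrew Mcafe"], degrees=sample_degrees)
```

A linear scan like this is fine for a sketch. With 30,000+ labels, a real implementation would more likely push the matching into the storage backend or an index.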
