Summary
Add a search_entity_labels tool that accepts one or more query strings and returns matching entity names from the graph along with their degree (relation count). This is a lightweight alternative to loading the full label list.
Proposed Interface
search_entity_labels(
    queries: list[str],   # one or more search terms
    limit: int = 10,      # max results per query
) -> list[{
    query: str,
    matches: list[{ entity: str, degree: int }]
}]
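To make the return shape concrete, here is a minimal Python sketch of the interface. It is only an illustration under stated assumptions: the `Match` and `QueryResult` type names are hypothetical, and the extra `degrees` argument is a stand-in for the real graph backend (a plain label-to-degree mapping) so the example can run on its own. Matching here is case-insensitive substring search, one of the options listed under Notes.

```python
from typing import TypedDict


class Match(TypedDict):
    entity: str
    degree: int


class QueryResult(TypedDict):
    query: str
    matches: list[Match]


def search_entity_labels(
    queries: list[str],
    degrees: dict[str, int],  # assumption: in-memory stand-in for the graph (label -> degree)
    limit: int = 10,
) -> list[QueryResult]:
    """Return up to `limit` matching labels per query, best-connected first."""
    results: list[QueryResult] = []
    for query in queries:
        needle = query.casefold()
        hits = [
            Match(entity=label, degree=degree)
            for label, degree in degrees.items()
            if needle in label.casefold()  # case-insensitive substring match
        ]
        # Highest degree first; ties are broken alphabetically for stable output.
        hits.sort(key=lambda m: (-m["degree"], m["entity"]))
        results.append(QueryResult(query=query, matches=hits[:limit]))
    return results
```

Sorting by degree before applying `limit` means a truncated result still contains the best-connected variants, which is what an agent would usually want as a query anchor.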
Motivation
The full entity label list is now 30,000+ entries and too large to load inline in agent contexts. Agents need a cheap way to ask "does LightRAG have a node for X?" before deciding whether to query, insert, or both.
Fuzzy matching matters because entity extraction produces variant forms — e.g. "Brynjolfsson" vs "Erik Brynjolfsson" vs "Erik J. Brynjolfsson". Exact-match lookups will miss these variants.
Degree matters because it signals whether a match is a well-connected hub or a leaf node. A hub entity is a much better query anchor than a low-degree one, and agents can use this to pick the best match when multiple variants exist.
Batch input (list of queries) matters for post-insert verification: after indexing a document, an agent can check 3–5 key entities in a single call rather than making N round trips.
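As a rough sketch of how an agent might consume a batched response, the helpers below pick the highest-degree variant for each query and list the queries that found nothing. The function names `best_anchor` and `missing_queries` are hypothetical, and the degree values in the sample are made up for illustration; only the response shape comes from the proposed interface.

```python
from typing import Optional


def best_anchor(result: dict) -> Optional[str]:
    """Return the highest-degree match for one query, or None if there is none."""
    matches = result["matches"]
    if not matches:
        return None
    return max(matches, key=lambda m: m["degree"])["entity"]


def missing_queries(results: list[dict]) -> list[str]:
    """Return the queries that produced no matches (e.g. entities not extracted)."""
    return [r["query"] for r in results if not r["matches"]]


# Illustrative response in the proposed shape; the degrees are invented.
sample = [
    {
        "query": "Brynjolfsson",
        "matches": [
            {"entity": "Brynjolfsson", "degree": 3},
            {"entity": "Erik Brynjolfsson", "degree": 42},
            {"entity": "Erik J. Brynjolfsson", "degree": 1},
        ],
    },
    {"query": "some unindexed concept", "matches": []},
]

anchors = {r["query"]: best_anchor(r) for r in sample}
not_found = missing_queries(sample)
```

With this sample, the first query resolves to the hub variant and the second is reported as missing, all from a single call's worth of data.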
Use Cases
- Pre-query: confirm whether a concept is richly represented before choosing query mode (local vs. global vs. hybrid)
- Post-insert: verify that key entities from a newly indexed document were extracted into the graph
- Entity disambiguation: find what the graph actually calls something before forming a traversal query
Notes
- Fuzzy matching could be implemented via substring search, case-insensitive prefix match, or a lightweight edit-distance approach — even simple substring matching would cover most cases
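- Degree can be retrieved from Neo4j (when used as the graph backend) with a straightforward size((n)--()) query. On Neo4j 5 and later, where size() no longer accepts a pattern, the equivalent is COUNT { (n)--() }.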
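The three matching strategies mentioned above could look roughly like this in Python. This is a sketch, not a proposed implementation: the function names and the 0.8 threshold are assumptions, and the third function uses the similarity ratio from the standard library's `difflib.SequenceMatcher` as a stand-in for a true edit-distance metric.

```python
from difflib import SequenceMatcher


def substring_match(query: str, label: str) -> bool:
    """Case-insensitive substring test: the query appears anywhere in the label."""
    return query.casefold() in label.casefold()


def prefix_match(query: str, label: str) -> bool:
    """Case-insensitive prefix test: the label starts with the query."""
    return label.casefold().startswith(query.casefold())


def similarity_match(query: str, label: str, threshold: float = 0.8) -> bool:
    """Similarity-ratio test (difflib, not a true edit distance).

    The threshold is an arbitrary illustrative value.
    """
    ratio = SequenceMatcher(None, query.casefold(), label.casefold()).ratio()
    return ratio >= threshold
```

The strategies behave differently on the variant forms from the Motivation section. Substring matching finds "Brynjolfsson" inside "Erik J. Brynjolfsson", while prefix matching does not. A similarity ratio tolerates a misspelling such as "Brynjolfson" but penalizes the large length difference between a surname and a full name, which supports the point that simple substring matching already covers most cases.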