[BUG] Intra-file CALLS edges incorrectly marked as external due to missing target ID resolution #2

@schneidermr

Describe the bug
All intra-file CALLS edges are misclassified as external even when the target symbol is defined in the same file. For example, a function that references oauth2_scheme (a module-level variable defined 3 lines above in the same file) produces a CALLS edge with to_node_type: external instead of resolving to the actual LexicalNode.

Root cause: _extract_call_edges() in _ast_utils.py (line 371) stores the bare callee name string as target_id (e.g. "oauth2_scheme"). LexicalNode.make_id() produces structured IDs of the form var:a1b2c3d4e5f6 (sha1 of {tenant_id}:{repo_id}:{file}:{name}:{node_type}). When Neo4jWriter.write_edges() runs MERGE (tgt:LexicalNode {node_id: "oauth2_scheme"}), no match is found and Neo4j creates a new stub node with node_type = 'external'. The resolver that rewrites CALLS target_id values to proper node_id hashes (_resolve_call_targets(), present in the main codesteward agent) was never ported to codesteward-graph.
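The mismatch can be illustrated with a minimal sketch. This make_id() is a hypothetical stand-in for LexicalNode.make_id(), not the real implementation: the 12-character truncation is inferred from the example ID above, and the prefix table is an assumption (only var appears in a concrete ID in this report).

```python
import hashlib

# Assumed mapping from node_type to ID prefix; only "var" is confirmed
# by the example ID quoted in this report.
PREFIXES = {"variable": "var", "function": "fn", "class": "cls"}


def make_id(tenant_id: str, repo_id: str, file: str, name: str, node_type: str) -> str:
    """Hypothetical stand-in for LexicalNode.make_id()."""
    raw = f"{tenant_id}:{repo_id}:{file}:{name}:{node_type}"
    # Truncating the digest to 12 hex characters is an assumption.
    digest = hashlib.sha1(raw.encode("utf-8")).hexdigest()[:12]
    return f"{PREFIXES[node_type]}:{digest}"


# ID under which the variable node would be stored.
node_id = make_id("t", "r", "auth.py", "oauth2_scheme", "variable")

# What _extract_call_edges() stores as target_id, according to this report.
bare_target = "oauth2_scheme"

# The two strings can never be equal, so a MERGE on node_id finds no
# existing node and creates a stub instead.
ids_match = node_id == bare_target
```

Because the bare name lacks both the prefix and the hash, no lookup keyed on node_id can ever match it without an explicit resolution step.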

To reproduce

  1. Create a Python file with a module-level variable and a function that references it:
    # auth.py
    from fastapi import Depends
    from fastapi.security import OAuth2PasswordBearer

    oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")

    async def get_current_user(token: str = Depends(oauth2_scheme)):
        ...
  2. Build the graph: graph_rebuild(repo_path=..., tenant_id="t", repo_id="r")
  3. Query referential edges for get_current_user:
    codebase_graph_query(query_type="referential", query="get_current_user", ...)
  4. Observe that the CALLS edge to oauth2_scheme has to_node_type: external and to_file: null, even though oauth2_scheme is defined in the same file.

Expected behavior
The CALLS edge should point to the LexicalNode for oauth2_scheme defined in auth.py. to_node_type should be variable and to_file should be auth.py.

Environment
codesteward version: codesteward-graph 0.2.2 / codesteward-mcp 0.2.2
Backend: Neo4j (also reproducible in stub mode — incorrect target_id is visible in raw ParseResult)
Transport: any
OS: any

Additional context
The fix requires a post-parse resolution pass in GraphBuilder.build_graph() (after all files are parsed, before write_edges()). The pass should:

  1. Build a name → node_id map from all collected LexicalNodes
  2. Rewrite any CALLS edge whose target_id equals a bare name string (i.e., doesn't start with a type prefix like fn, var, cls) to the corresponding resolved node_id
  3. Leave unresolved edges (genuinely external symbols) unchanged

This resolver exists as _resolve_call_targets() in the main codesteward agent and can be ported directly. The same issue affects all languages, not just Python: any call to a same-file symbol will be falsely marked external.
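The three steps above could look roughly like the sketch below. It is not the actual _resolve_call_targets() code: the Node and Edge dataclasses, their field names, and the exact prefix strings are hypothetical stand-ins for the real LexicalNode and edge types. The map is keyed by (file, name) rather than name alone, which is a design choice made here so that same-named symbols in different files do not collide.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for LexicalNode."""
    node_id: str
    name: str
    file: str


@dataclass(frozen=True)
class Edge:
    """Hypothetical stand-in for a parsed edge."""
    edge_type: str
    source_file: str
    target_id: str


# Assumed prefix strings, based on the fn, var, cls prefixes named above.
TYPE_PREFIXES = ("fn:", "var:", "cls:")


def resolve_call_targets(nodes, edges):
    """Rewrite bare-name CALLS targets to node IDs from the same file."""
    # Step 1: build the lookup map from all collected nodes.
    by_file_and_name = {(n.file, n.name): n.node_id for n in nodes}

    resolved = []
    for edge in edges:
        is_bare = not edge.target_id.startswith(TYPE_PREFIXES)
        if edge.edge_type == "CALLS" and is_bare:
            # Step 2: rewrite the target when a same-file definition exists.
            node_id = by_file_and_name.get((edge.source_file, edge.target_id))
            if node_id is not None:
                edge = replace(edge, target_id=node_id)
        # Step 3: unresolved (genuinely external) edges pass through unchanged.
        resolved.append(edge)
    return resolved


nodes = [Node("var:a1b2c3d4e5f6", "oauth2_scheme", "auth.py")]
edges = [
    Edge("CALLS", "auth.py", "oauth2_scheme"),  # defined in the same file
    Edge("CALLS", "auth.py", "Depends"),        # imported, stays external
]
resolved = resolve_call_targets(nodes, edges)
```

In this sketch the first edge ends up pointing at var:a1b2c3d4e5f6, while the Depends edge keeps its bare name and would still be written as an external stub.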
