feat: resolve TypeScript tsconfig path aliases in import edges #155

Open
nuthalapativarun wants to merge 4 commits into safishamsi:v3 from nuthalapativarun:feat/tsconfig-path-alias-resolution

@nuthalapativarun nuthalapativarun commented Apr 9, 2026

Summary

Closes #147

Graphify previously treated aliased TypeScript imports (e.g. @/components/Button) as opaque strings, producing broken or missing import edges for any project using tsconfig.json paths mappings (Next.js, Vite, Create React App, etc.).

This PR adds tsconfig-aware alias resolution so those imports map to real module names in the graph.

Changes

graphify/detect.py

  • load_tsconfig_paths(root) — walks up the directory tree from the file being extracted to find tsconfig.json, parses compilerOptions.baseUrl and compilerOptions.paths, and returns an alias_prefix → resolved_path dict
  • resolve_ts_alias(import_path, alias_map) — replaces a matching alias prefix with its resolved filesystem path
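A minimal sketch of how the two helpers described above could fit together. This is not the PR's code. It assumes strict JSON input, keeps the trailing slash in each alias prefix, and skips a bare `*` pattern; the function bodies are illustrative only.

```python
import json
from pathlib import Path


def load_tsconfig_paths(start: Path) -> dict[str, str]:
    """Walk up from `start` to the nearest tsconfig.json and build an alias map."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / "tsconfig.json"
        if candidate.is_file():
            break
    else:
        return {}  # no tsconfig.json anywhere in the ancestor tree

    try:
        # Assumption: strict JSON only; comments or trailing commas end up here as {}.
        config = json.loads(candidate.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return {}

    options = config.get("compilerOptions", {})
    base = (candidate.parent / options.get("baseUrl", ".")).resolve()
    alias_map: dict[str, str] = {}
    for pattern, targets in options.get("paths", {}).items():
        prefix = pattern.removesuffix("*")
        if not targets or not prefix:
            continue  # nothing to map, or a catch-all "*" pattern (skipped in this sketch)
        # Only the first target is used; the "*" wildcard is dropped from both sides.
        target = targets[0].removesuffix("*")
        resolved = (base / target).as_posix()
        alias_map[prefix] = resolved + ("/" if target.endswith("/") else "")
    return alias_map


def resolve_ts_alias(import_path: str, alias_map: dict[str, str]) -> str:
    """Replace the longest matching alias prefix; return the input unchanged otherwise."""
    for prefix in sorted(alias_map, key=len, reverse=True):
        if import_path.startswith(prefix):
            return alias_map[prefix] + import_path[len(prefix):]
    return import_path
```

Trying prefixes longest-first is what lets a more specific alias take precedence over a shorter one that also matches.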

graphify/extract.py

  • _import_js() — accepts an optional alias_map kwarg; resolves any alias before deriving the module name for the edge target
  • extract_js() — calls load_tsconfig_paths() for the file's directory; if aliases are found, wraps _import_js in a closure that captures the map and passes it as import_handler via dataclasses.replace

tests/

  • New fixture: tests/fixtures/tsconfig_alias/ — a minimal TypeScript project with tsconfig.json paths and a source file using @/ and @components/ aliases
  • 6 new tests in test_extract.py: load_tsconfig_paths discovery, empty-map fallback, resolve_ts_alias prefix matching, longer-prefix precedence, no-match pass-through, and an end-to-end extract_js test (skipped if tree-sitter-typescript is not installed)

Before / After

// src/pages/Home.ts
import Button from "@/components/Button";
import { useAuth } from "@/hooks/useAuth";
Edge target:
  Before: @ (raw alias, broken)
  After: button, useauth (resolved module names)

Notes

  • Resolution is lazy: load_tsconfig_paths is only called for .js/.ts/.tsx files and only when a tsconfig.json exists in the ancestor tree. No performance impact on non-TypeScript projects.
  • Only the first glob target per alias is used (standard behaviour — projects almost never list multiple targets).
  • Non-alias imports (./local, react, etc.) are completely unaffected.

Test plan

  • test_load_tsconfig_paths_finds_config — finds tsconfig.json by walking up from subdirectory
  • test_load_tsconfig_paths_no_config — returns {} when no tsconfig exists
  • test_resolve_ts_alias_replaces_prefix: @/hooks/useAuth → /project/src/hooks/useAuth
  • test_resolve_ts_alias_longer_prefix_wins: @components/Sidebar uses @components not @
  • test_resolve_ts_alias_no_match — relative and bare imports unchanged
  • test_extract_js_resolves_aliases — end-to-end: no raw @ in edge targets

…hamsi#147)

Add load_tsconfig_paths() to detect.py that walks up the directory
tree to find tsconfig.json and extracts compilerOptions.paths.
Add resolve_ts_alias() to map alias prefixes (e.g. @/*) to their
resolved filesystem paths.

Update extract_js() to load aliases for the file being processed and
pass them through a closure to _import_js(), so aliased imports like
@/components/Button resolve to real module names in the edge graph.
@nuthalapativarun nuthalapativarun force-pushed the feat/tsconfig-path-alias-resolution branch from 199c4a4 to 9505533 Compare April 9, 2026 17:14
@vhsantos26

Ran graphify on a production React Native TypeScript codebase (~800 TS/TSX files, heavy use of compilerOptions.paths aliases like @features/*, @hooks/*, @components/*).

Current behavior (main):

  • imports_ratio: 0.0% (0 of 1,629 edges are imports)
  • 792 communities detected for 792 source files — essentially one community per file
  • 63% of nodes are leaves (degree=1), 9% fully isolated
  • Output: community names default to the largest file's title-cased filename, producing collisions and broken casing

Root cause matches exactly what this PR fixes: in _import_js, any import that doesn't start with . is treated as node_modules and dropped — so every aliased import (from "@features/...") becomes a dangling edge to a stub node that doesn't match any file node, silently pruned during graph build.

Would be very useful to have this merged. Happy to test the patch against the same corpus and report back the delta if that helps.

@vhsantos26

Tested this PR end-to-end on a real TypeScript codebase (~1,400 TS/TSX files, 22 aliases in compilerOptions.paths). Found a silent failure that prevents the patch from taking effect in most real projects:

tsconfig.json is JSONC by default, not strict JSON. TypeScript's own spec allows line/block comments and trailing commas, and virtually every tsconfig generated by tsc --init, Next.js, Vite, React Native, etc. ships with /* ... */ annotations on every option.

json.loads(tsconfig_path.read_text(...)) raises JSONDecodeError, the except clause returns {}, aliases are never loaded, and the rest of the PR is a no-op. No error surfaces to the user.
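The silent failure is easy to reproduce in isolation. The snippet below is a sketch of the pattern described here, not the PR's code: `load_paths_strict` is a hypothetical helper that parses with `json.loads` and swallows the error.

```python
import json

# A tsconfig in the JSONC style that tooling typically generates.
JSONC_TSCONFIG = """{
  "compilerOptions": {
    /* Path mapping */
    "baseUrl": ".",
    "paths": { "@/*": ["src/*"] },  // trailing comma on this line
  }
}"""


def load_paths_strict(text: str) -> dict:
    """Strict parse with a swallow-all fallback, as described above."""
    try:
        return json.loads(text).get("compilerOptions", {}).get("paths", {})
    except json.JSONDecodeError:
        return {}  # the caller cannot tell "no aliases" from "parse failed"


print(load_paths_strict(JSONC_TSCONFIG))  # prints {}
```

An empty dict is also the legitimate result for a project with no aliases, which is why nothing surfaces to the user.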

Measured impact on a real corpus (after fixing locally):

imports_ratio:    3.0% → 23.8%   (10.3x, +4,786 edges)
isolated nodes:   361 → 53       (-85%)
communities:      531 → 133      (-75%, fragmentation collapsed)

The naive regex strip (/\*.*?\*/) breaks on string literals like "@assets/*" — the /* inside the string gets consumed. A string-aware pass handles it correctly:

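import json
import re

# tsconfig_path: pathlib.Path of the tsconfig.json located earlier
raw = tsconfig_path.read_text(encoding="utf-8")
out: list[str] = []
i, n = 0, len(raw)
while i < n:
    c = raw[i]
    if c == '"':
        # String literal: copy it verbatim, honouring backslash escapes,
        # so comment-like sequences inside strings are preserved.
        j = i + 1
        while j < n:
            if raw[j] == "\\" and j + 1 < n:
                j += 2
                continue
            if raw[j] == '"':
                j += 1
                break
            j += 1
        out.append(raw[i:j])
        i = j
    elif c == "/" and i + 1 < n and raw[i + 1] == "/":
        # Line comment: skip up to (but not including) the newline.
        nl = raw.find("\n", i)
        i = n if nl == -1 else nl
    elif c == "/" and i + 1 < n and raw[i + 1] == "*":
        # Block comment: skip past the closing */ (or to EOF if unterminated).
        end = raw.find("*/", i + 2)
        i = n if end == -1 else end + 2
    else:
        out.append(c)
        i += 1
# Drop trailing commas before } or ]. This regex is not string-aware, so a
# string value containing ",}" or ",]" would also be rewritten.
stripped = re.sub(r",\s*([}\]])", r"\1", "".join(out))
data = json.loads(stripped)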

Handles: // line comments, /* */ block comments (including multi-line), trailing commas, and string literals containing /*, // or " (with escape handling).
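To make the naive-regex failure concrete, here is a small reproduction. The sample text and the helper names `naive_strip` and `parses` are made up for illustration.

```python
import json
import re

# Valid JSON followed by a block comment; the alias key itself contains "/*".
TEXT = '{"compilerOptions": {"paths": {"@assets/*": ["src/assets/*"]}}} /* generated */'


def naive_strip(text: str) -> str:
    """Remove /* ... */ spans with a regex that knows nothing about strings."""
    return re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)


def parses(text: str) -> bool:
    """Report whether `text` is valid strict JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


stripped = naive_strip(TEXT)
print(stripped)          # prints {"compilerOptions": {"paths": {"@assets
print(parses(stripped))  # prints False
```

The match starts at the `/*` inside `"@assets/*"` and runs to the first `*/`, which here is the end of the real comment, so everything in between is deleted. A file with no block comment after the alias would pass through unchanged, which may be why this kind of bug can go unnoticed on small fixtures.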

Feel free to fold this into the PR — happy to open a follow-up PR against your branch if that's easier.

Real-world tsconfig.json files use JSONC syntax (// and /* */ comments,
trailing commas) by default. The plain json.loads() call silently returned
{} on any such file, making alias resolution a no-op in practice.

Replace with a string-aware JSONC stripper that handles line comments,
block comments (including multi-line), trailing commas, and string
literals containing comment-like sequences.
@nuthalapativarun
Author

Thanks @vhsantos26 — you're right on both counts. The json.loads call fails silently on any real tsconfig (which is JSONC by default), making the whole PR a no-op in practice.

I've pushed a fix in 3fbfa6f that replaces the bare json.loads with a string-aware JSONC stripper handling // line comments, /* */ block comments (including multi-line), trailing commas, and string literals containing comment-like sequences — based on the approach you shared.

The numbers you measured (imports_ratio 3% → 23.8%, isolated nodes -85%, communities -75%) are exactly the kind of signal I was hoping this would produce on a real corpus. Appreciate you testing it end-to-end.

@qodo-ai-reviewer

Hi, extract() returns cached JS/TS extraction results keyed only by the source file hash, but extract_js() now depends on tsconfig.json (paths/baseUrl). Changing tsconfig.json will not invalidate cached results, so import edges can remain wrong until the source files change or the cache is cleared.

Severity: action required | Category: correctness

How to fix: Include tsconfig in cache key

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

JS/TS extraction is now tsconfig-dependent (paths/baseUrl), but the extraction cache is keyed only by the source file contents. This causes stale imports_from edges when tsconfig.json changes.

Issue Context

  • extract() uses load_cached() and skips calling extract_js() when a cache entry exists.
  • load_cached() is keyed by file_hash(path) which hashes only the code file contents + resolved path.
  • extract_js() calls load_tsconfig_paths(path.parent) and changes edge targets based on tsconfig.

Fix Focus Areas

  • graphify/extract.py[1103-1119]
  • graphify/extract.py[2653-2660]
  • graphify/cache.py[10-17]
  • graphify/cache.py[27-44]

Suggested approach

  • Make the per-file cache key include the effective tsconfig.json used for that file (e.g., hash/mtime of the found tsconfig.json + baseUrl/paths content), or
  • Store alias-map-related metadata in the cached result and validate it on load (recompute if tsconfig changed), or
  • Disable cache reuse for JS/TS when a tsconfig.json is present unless the cache is scoped to the tsconfig content.

We noticed a couple of other issues in this PR as well - happy to share if helpful.



extract_js() now resolves aliases from tsconfig.json, but the cache
was keyed only on source file contents. A tsconfig change would leave
stale import edges in the cache until the source files themselves
changed.

Add an extra_key parameter to file_hash/load_cached/save_cached.
In the extraction loop, JS/TS files with a non-empty alias map hash
the serialised alias map and mix it into the cache key, so any change
to compilerOptions.paths or baseUrl triggers a cache miss.
@nuthalapativarun
Author

Good catch @qodo-ai-reviewer — fixed in 586c1c9.

The cache key now includes a SHA256 of the serialised alias map for JS/TS files. load_cached / save_cached accept an extra_key: bytes parameter; the extraction loop computes it from load_tsconfig_paths() before the cache lookup, so any change to compilerOptions.paths or baseUrl produces a different key and forces a fresh extraction.
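For readers following along, the keying scheme could be sketched as below. The `extra_key` parameter and the SHA256-of-alias-map idea come from this thread; the function bodies, the `alias_map_key` helper, and the choice of `json.dumps(..., sort_keys=True)` for serialisation are assumptions, not the code in 586c1c9.

```python
import hashlib
import json
from pathlib import Path


def alias_map_key(alias_map: dict[str, str]) -> bytes:
    """Digest of the alias map; empty bytes when there are no aliases."""
    if not alias_map:
        return b""
    # sort_keys makes the digest independent of dict insertion order.
    serialised = json.dumps(alias_map, sort_keys=True).encode("utf-8")
    return hashlib.sha256(serialised).digest()


def file_hash(path: Path, extra_key: bytes = b"") -> str:
    """Hash file contents + resolved path, plus any extra key material."""
    h = hashlib.sha256()
    h.update(path.read_bytes())
    h.update(str(path.resolve()).encode("utf-8"))
    h.update(extra_key)
    return h.hexdigest()
```

With this shape, the extraction loop would compute `alias_map_key(load_tsconfig_paths(...))` before the cache lookup and pass it as `extra_key`, so editing `paths` or `baseUrl` changes the key even when the source file is untouched.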

Happy to hear the other issues you found as well.
