Add "package deduplication", --disablePackageDeduplication #2361
Conversation
Pull request overview
This PR implements package deduplication for the TypeScript compiler port to address issue #1080. When multiple packages with the same name and version are installed in different node_modules locations, the compiler now treats them as compatible by replacing duplicate ASTs with a canonical version.
Key changes:
- Adds `--disablePackageDeduplication` flag (default: false) to control the behavior
- Modifies package ID format to include the resolved file path within the package, making IDs more specific
- Implements deduplication during file collection by replacing duplicate file references with canonical ones
- Updates program incremental update logic to trigger full rebuilds when deduplicated files change
Reviewed changes
Copilot reviewed 74 out of 74 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| internal/core/compileroptions.go | Adds new DisablePackageDeduplication compiler option field |
| internal/tsoptions/parsinghelpers.go | Adds parsing logic for the new option |
| internal/tsoptions/declscompiler.go | Defines command-line option metadata with appropriate flags |
| internal/diagnostics/*.{json,go} | Adds diagnostic message for the new option |
| internal/module/resolver.go | Changes package ID generation to use resolved file path instead of package directory, making SubModuleName more specific |
| internal/compiler/program.go | Implements redirect target tracking and updates incremental compilation logic to handle deduplicated files |
| internal/compiler/fileloader.go | Adds data structures for tracking package deduplication mappings |
| internal/compiler/filesparser.go | Implements core deduplication logic that tracks canonical files and replaces duplicates |
| internal/fourslash/tests/manual/duplicatePackageServices_fileChanges_test.go | Adds test for language service behavior with file changes in deduplicated packages |
| internal/fourslash/_scripts/*.txt | Updates test lists to reflect new passing tests |
| testdata/baselines/reference/**/*.{js,txt,diff,jsonc} | Updates baseline files showing new package ID format and deduplicated file behavior |
```go
if canonicalPath, ok := deduplicatedPathMap[f.Path()]; ok {
	return f.Path() != canonicalPath
}
return false
```
**Copilot AI** · Dec 17, 2025
The deduplication logic maps duplicate paths to canonical files in filesByPath, but it doesn't verify that the canonical file was actually parsed and loaded. If a canonical file failed to parse or wasn't included for some reason, duplicate paths would map to nil or an invalid entry. Consider adding a check or warning if the canonical file doesn't exist in filesByPath.
Suggested change:

```go
canonicalPath, ok := deduplicatedPathMap[f.Path()]
if !ok {
	return false
}
// Only delete a duplicate when the canonical file actually exists in filesByPath.
if f.Path() == canonicalPath {
	return false
}
if _, exists := filesByPath[canonicalPath]; !exists {
	return false
}
return true
```
```go
if canonical, exists := packageIdToCanonicalPath[pkgId]; exists {
	if _, alreadyRecorded := sourceFileToPackageName[resolvedPath]; !alreadyRecorded {
		sourceFileToPackageName[resolvedPath] = packageName
		if resolvedPath != canonical {
			deduplicatedPathMap[resolvedPath] = canonical
			redirectTargetsMap[canonical] = append(redirectTargetsMap[canonical], resolution.ResolvedFileName)
		}
	}
} else {
	packageIdToCanonicalPath[pkgId] = resolvedPath
	sourceFileToPackageName[resolvedPath] = packageName
	deduplicatedPathMap[resolvedPath] = resolvedPath
}
```
**Copilot AI** · Dec 17, 2025
The logic for deduplication relies on recording canonical paths during file collection, but canonical files are chosen in a non-deterministic manner based on the order resolutions are encountered. Consider adding documentation or a comment explaining which file becomes canonical (first encountered), since this ordering might matter for debugging or when understanding the behavior. Also consider whether sorting package IDs or using a deterministic ordering would improve consistency across runs.
```diff
->a : import("/p2/node_modules/typescript-fsa/index").A
+>a : import("/p1/node_modules/typescript-fsa/index").A
```
This is technically addressed by #1604, in that that PR would instead say `import("typescript-fsa")`.
Fixes #1080
The old compiler did "package deduplication" through source file redirection. This allowed two duplicate packages in different parts of `node_modules` (but with the same name / package.json version) to be "compatible" by way of them being the same ASTs, even though they are at different paths (and so, at runtime, are probably not compatible).

The old compiler did this by essentially proxying the original `SourceFile` via `Object.create(sourceFile)` (the original file is the prototype 😨), then overriding `symbol`, `fileName`, etc., on that new instance.

The old compiler also chose the "winning"/canonical packages as they were walked, saving work and producing a deterministic result. The new compiler's loading is fully concurrent, so that is a challenge.
We can't really do the same thing as we did before.
Instead, this PR lets loading proceed as "normal", with no deduplication. Then, during `collectFiles`, it deduplicates packages. Deduplication is done by quite literally replacing duplicates with the canonical files' ASTs. This means looking them up later will give the "right" source file, with the "right" nodes and symbols, as before. This seems to be sufficient; it's basically just what we did before, but less optimal, and without creating a completely new `SourceFile` node with different properties.
I've also made a flag that can disable this behavior, `--disablePackageDeduplication`, defaulting to `false`, for those who don't want this behavior (which, IMO, is objectively correct, but it is breaking / annoying).

I also tried reintroducing the concept of "redirected files" by moving data off of `SourceFile` into an embedded pointer, but that hit some walls where things didn't work. I'm not sure it's worth doing if this method seems to work fine?