Skip to content

Conversation

@jakebailey
Copy link
Member

@jakebailey jakebailey commented Dec 12, 2025

Fixes #1080

The old compiler did "package deduplication" through source file redirection. This allowed two duplicate packages in different parts of node_modules (but with the same name / package.json version) to be "compatible" by way of them being the same ASTs even though they are at different paths (and so at runtime, probably are not compatible).

The old compiler did this by essentially proxying the original SourceFile via Object.create(sourceFile) (the original file is the prototype 😨), then overriding symbol, fileName, etc, on that new instance.

The old compiler also chose the "winning"/canonical packages as they were walked, saving work and producing a deterministic result. The new compiler's loading is fully concurrent, so that is a challenge.

We can't really do the same thing as we did before.

Instead, this PR lets loading proceed as "normal", no deduplication. Then, during collectFiles, it deduplicates packages. Deduplication is done by quite literally replacing duplicates with the canonical files' ASTs. This means looking them up later will give the "right" source file, with the "right" nodes and symbols, as before.

This seems to be sufficient; it's basically just what we did before, but less optimal, and without creating a completely new SourceFile node with different.

I've also made a flag that can disable this behavior, --disablePackageDeduplication, defaulting to false, for those who don't want this behavior (which, IMO is objectively correct, but, it is breaking / annoying).


I tried also reintroducing the concept of "redirected files" by moving data off of SourceFile into an embedded pointer, but that hit some walls where things didn't work. I'm not sure if it's worth doing if this method seems to work fine?

@jakebailey jakebailey changed the title Add "package deduplication", sorta Add "package deduplication" Dec 15, 2025
@jakebailey jakebailey changed the title Add "package deduplication" Add "package deduplication", --disablePackageDeduplication Dec 15, 2025
@jakebailey jakebailey marked this pull request as ready for review December 17, 2025 03:18
Copilot AI review requested due to automatic review settings December 17, 2025 03:18
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements package deduplication for the TypeScript compiler port to address issue #1080. When multiple packages with the same name and version are installed in different node_modules locations, the compiler now treats them as compatible by replacing duplicate ASTs with a canonical version.

Key changes:

  • Adds --disablePackageDeduplication flag (default: false) to control the behavior
  • Modifies package ID format to include the resolved file path within the package, making IDs more specific
  • Implements deduplication during file collection by replacing duplicate file references with canonical ones
  • Updates program incremental update logic to trigger full rebuilds when deduplicated files change

Reviewed changes

Copilot reviewed 74 out of 74 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
internal/core/compileroptions.go Adds new DisablePackageDeduplication compiler option field
internal/tsoptions/parsinghelpers.go Adds parsing logic for the new option
internal/tsoptions/declscompiler.go Defines command-line option metadata with appropriate flags
internal/diagnostics/*.{json,go} Adds diagnostic message for the new option
internal/module/resolver.go Changes package ID generation to use resolved file path instead of package directory, making SubModuleName more specific
internal/compiler/program.go Implements redirect target tracking and updates incremental compilation logic to handle deduplicated files
internal/compiler/fileloader.go Adds data structures for tracking package deduplication mappings
internal/compiler/filesparser.go Implements core deduplication logic that tracks canonical files and replaces duplicates
internal/fourslash/tests/manual/duplicatePackageServices_fileChanges_test.go Adds test for language service behavior with file changes in deduplicated packages
internal/fourslash/_scripts/*.txt Updates test lists to reflect new passing tests
testdata/baselines/reference/**/*.{js,txt,diff,jsonc} Updates baseline files showing new package ID format and deduplicated file behavior

Comment on lines +461 to +464
if canonicalPath, ok := deduplicatedPathMap[f.Path()]; ok {
return f.Path() != canonicalPath
}
return false
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deduplication logic maps duplicate paths to canonical files in filesByPath, but it doesn't verify that the canonical file was actually parsed and loaded. If a canonical file failed to parse or wasn't included for some reason, duplicate paths would map to nil or an invalid entry. Consider adding a check or warning if the canonical file doesn't exist in filesByPath.

Suggested change
if canonicalPath, ok := deduplicatedPathMap[f.Path()]; ok {
return f.Path() != canonicalPath
}
return false
canonicalPath, ok := deduplicatedPathMap[f.Path()]
if !ok {
return false
}
// Only delete a duplicate when the canonical file actually exists in filesByPath.
if f.Path() == canonicalPath {
return false
}
if _, exists := filesByPath[canonicalPath]; !exists {
return false
}
return true

Copilot uses AI. Check for mistakes.
Comment on lines +358 to +371
if canonical, exists := packageIdToCanonicalPath[pkgId]; exists {
if _, alreadyRecorded := sourceFileToPackageName[resolvedPath]; !alreadyRecorded {
sourceFileToPackageName[resolvedPath] = packageName
if resolvedPath != canonical {

deduplicatedPathMap[resolvedPath] = canonical
redirectTargetsMap[canonical] = append(redirectTargetsMap[canonical], resolution.ResolvedFileName)
}
}
} else {
packageIdToCanonicalPath[pkgId] = resolvedPath
sourceFileToPackageName[resolvedPath] = packageName
deduplicatedPathMap[resolvedPath] = resolvedPath
}
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for deduplication relies on recording canonical paths during file collection, but canonical files are chosen in a non-deterministic manner based on the order resolutions are encountered. Consider adding documentation or a comment explaining which file becomes canonical (first encountered), since this ordering might matter for debugging or when understanding the behavior. Also consider whether sorting package IDs or using a deterministic ordering would improve consistency across runs.

Copilot uses AI. Check for mistakes.
Comment on lines +7 to +8
->a : import("/p2/node_modules/typescript-fsa/index").A
+>a : import("/p1/node_modules/typescript-fsa/index").A
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is technically fixed by #1604 in that that PR would instead say import("typescript-fsa") instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Deduplicate doubly-installed packages

2 participants