
Conversation

@gabritto (Member) commented Sep 29, 2025

In this PR:

  • Port of DocumentPositionMapper from Strada
  • Make go to definition follow source maps
  • LS/snapshot changes to make it all work

The most notable change is that snapshots are now allowed to read a file from the file system. That file is then cached, so that all subsequent reads of it return the same result. This is needed because we have to read source map files, which are not part of a Program. Document position mappers are cached in the language service so we don't have to rebuild them multiple times per LS. In Strada, this all happened in the project or project service: we cached source map files, and therefore also had to watch them, which greatly complicated things. By allowing the snapshot/LS to read a file and cache information about it just for the duration of the snapshot/LS, we don't need to watch those files.

This PR supersedes #1710 (and is inspired by it).
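As a rough sketch of the caching behavior described above (the types and names here are simplified stand-ins, not the PR's actual snapshotfs code), a snapshot can memoize each disk read with sync.OnceValue so that every consumer of a given file, including the source map reader, sees the same contents for the lifetime of the snapshot:

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// diskFile is a hypothetical stand-in for a cached read result.
type diskFile struct {
	text string
	ok   bool
}

// snapshotFS memoizes reads so a file is hit on disk at most once per snapshot.
type snapshotFS struct {
	mu    sync.Mutex
	files map[string]func() *diskFile // each entry is a sync.OnceValue-wrapped read
}

func newSnapshotFS() *snapshotFS {
	return &snapshotFS{files: make(map[string]func() *diskFile)}
}

// readFile returns the cached contents of fileName, reading from disk only on first use.
func (s *snapshotFS) readFile(fileName string) (string, bool) {
	s.mu.Lock()
	read, ok := s.files[fileName]
	if !ok {
		read = sync.OnceValue(func() *diskFile {
			data, err := os.ReadFile(fileName)
			return &diskFile{text: string(data), ok: err == nil}
		})
		s.files[fileName] = read
	}
	s.mu.Unlock()

	file := read() // executes the underlying read at most once, even across goroutines
	return file.text, file.ok
}

func main() {
	fs := newSnapshotFS()
	if text, ok := fs.readFile("bundle.js.map"); ok {
		fmt.Println(len(text)) // a second readFile of the same path returns the cached result
	}
}
```

The actual change follows the same memoizedDiskFile / sync.OnceValue shape, as can be seen in the snapshotfs hunk quoted later in the review.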

// line = line < 0 ? 0 : line >= lineStarts.length ? lineStarts.length - 1 : line;
// }
panic(fmt.Sprintf("Bad line number. Line: %d, lineStarts.length: %d.", line, len(lineStarts)))
if allowEdits {
@gabritto (Member Author), Sep 29, 2025:

This is a direct port of Strada and allows cases where the source file and source map are out of sync; see the test declarationMapsOutOfDateMapping.
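A minimal sketch of the behavior being described, assuming a helper shaped roughly like this (the name and signature are illustrative, not the PR's actual code):

```go
package main

import "fmt"

// clampLine is a minimal sketch of the edit-tolerant position lookup described above;
// the name and signature are illustrative, not the PR's actual code.
func clampLine(line int, lineStarts []int, allowEdits bool) int {
	if line >= 0 && line < len(lineStarts) {
		return line
	}
	if !allowEdits {
		panic(fmt.Sprintf("Bad line number. Line: %d, lineStarts.length: %d.", line, len(lineStarts)))
	}
	// With allowEdits (used when a source map may be stale relative to its source file),
	// out-of-range lines are clamped to the nearest valid line instead of panicking.
	if line < 0 {
		return 0
	}
	return len(lineStarts) - 1
}

func main() {
	lineStarts := []int{0, 10, 25}                // a 3-line file
	fmt.Println(clampLine(7, lineStarts, true))   // 2: the stale map points past the end, so clamp to the last line
}
```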


// Originally: /^data:(?:application\/json;charset=[uU][tT][fF]-8;base64,([A-Za-z0-9+/=]+)$)?/
// Should have been /^data:(?:application\/json;(?:charset=[uU][tT][fF]-8;)?base64,([A-Za-z0-9+/=]+)$)?/
func tryParseBase46Url(url string) (parseableUrl string, isBase64Url bool) {
@gabritto (Member Author):

I think the Strada regex that this function is replacing had a problem. The regex base64UrlRegExp, as written, required charset=utf-8 to always be specified for us to decode an inline source map. But we don't even specify a charset when we write source maps in the emitter, so we wouldn't parse our own source map URLs.
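A quick standalone check of the two patterns (the variable names here are illustrative, and the payload is just {"version":3} base64-encoded) shows the difference:

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Strada's original pattern: charset=utf-8 is mandatory.
	strictBase64URL = regexp.MustCompile(`^data:(?:application/json;charset=[uU][tT][fF]-8;base64,([A-Za-z0-9+/=]+)$)?`)
	// Corrected pattern: charset=utf-8 is optional.
	lenientBase64URL = regexp.MustCompile(`^data:(?:application/json;(?:charset=[uU][tT][fF]-8;)?base64,([A-Za-z0-9+/=]+)$)?`)
)

func main() {
	// A source map URL as the emitter writes it: no charset parameter.
	url := "data:application/json;base64,eyJ2ZXJzaW9uIjozfQ=="

	fmt.Println(strictBase64URL.FindStringSubmatch(url))  // capture group is empty: no inline map would be decoded
	fmt.Println(lenientBase64URL.FindStringSubmatch(url)) // capture group holds the base64 payload
}
```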

@@ -0,0 +1,314 @@
package sourcemap
@gabritto (Member Author):

This file is also basically a close port of Strada.

@gabritto marked this pull request as ready for review September 29, 2025 22:28
@Copilot review requested due to automatic review settings September 29, 2025 22:28
@Copilot (Contributor) left a comment:

Pull Request Overview

This PR ports source map position mapping from Strada and implements go-to-definition source mapping functionality. It enables the language service to follow source maps when navigating to definitions, mapping from generated files (like .d.ts) back to their original source files.

Key changes:

  • Introduces DocumentPositionMapper for source map position mapping
  • Updates go-to-definition to use source map mapping when available
  • Extends snapshots to read and cache files from the filesystem for source map files

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.

Summary per file:

  • internal/sourcemap/source_mapper.go: new DocumentPositionMapper implementation for source map position mapping
  • internal/ls/source_map.go: new language service source map utilities for position mapping
  • internal/ls/languageservice.go: updated to support source map functionality with file reading capabilities
  • internal/ls/definition.go: modified go-to-definition to use source map mapping
  • internal/project/snapshot.go: added file system access methods for reading source map files
  • internal/project/snapshotfs.go: enhanced with memoized file reading for source map caching
  • internal/scanner/scanner.go: extended position computation with edit tolerance for source maps
  • internal/core/core.go: added DeduplicateSorted utility function
  • testdata/baselines/reference/fourslash/goToDefinition/*.baseline.jsonc: updated test baselines showing source map navigation working

if file, ok := s.diskFiles[s.toPath(fileName)]; ok {
	return file
}
newEntry := memoizedDiskFile(sync.OnceValue(func() *diskFile {
Member:

This is probably fine, but I had imagined a change to cachedvfs.FS instead. Was there a reason not to put the cache there? I’m not sure why there wasn’t a cache there to begin with—presumably the result of every ReadFile gets stored somewhere in memory, and my memory is that the underlying memory for the string will be shared when it’s assigned from the ReadFile result to the cache to the diskFile to the SourceFile and so on.

@gabritto (Member Author):

There were other usages of cachedvfs.FS where I wasn't sure caching file reads would make sense or how it would affect those usages, and Sheetal seemed to think we shouldn't cache files at that level.

Member:

Yeah, caching at the cachedvfs level means you need to clear it on writeFile, otherwise tsc -b will run into issues if we don't handle that well, which would add cost, right? And normally in those scenarios the text gets cached in the SourceFile, so another cache is generally not needed.

@gabritto (Member Author):

Thanks for clarifying!

Member:

I guess that makes sense; there are already some weird inconsistencies with cachedvfs (FileExists can be out of sync with Stat can be out of sync with ReadFile) but I guess we can avoid growing those any more.

possibleMapLocations = append(possibleMapLocations, generatedFileName+".map")
for _, location := range possibleMapLocations {
	mapFileName := tspath.GetNormalizedAbsolutePath(location, tspath.GetDirectoryPath(generatedFileName))
	if mapFileContents, ok := host.ReadFile(mapFileName); ok {
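The loop above tries candidate map paths in order. A hedged sketch of how such a candidate list could be assembled (the sourceMappingURL handling is an assumption about the surrounding code, which the diff does not show in full):

```go
package main

import (
	"fmt"
	"strings"
)

// possibleMapLocations sketches how candidate source map paths might be collected
// for a generated file: a //# sourceMappingURL comment (when it is not an inline
// data: URL) first, then the conventional sibling <file>.map as a fallback.
func possibleMapLocations(generatedFileName, sourceMappingURL string) []string {
	var locations []string
	if sourceMappingURL != "" && !strings.HasPrefix(sourceMappingURL, "data:") {
		locations = append(locations, sourceMappingURL)
	}
	return append(locations, generatedFileName+".map")
}

func main() {
	// Relative URLs are later resolved against the generated file's directory,
	// as in the GetNormalizedAbsolutePath call shown above.
	fmt.Println(possibleMapLocations("/out/foo.d.ts", "foo.d.ts.map"))
	// Output: [foo.d.ts.map /out/foo.d.ts.map]
}
```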
Member:

The problem with having these in the LS, instead of in the project layer with these files watched, is that navigation leads to discarding them and having to reparse source maps, etc. That used to be a bottleneck early on when this lived in the LS. Tying their lifetime to the ScriptInfo they're associated with, or keeping them until things change, meant we could avoid parsing things again and again, and that really helped with perf.

@gabritto (Member Author):

How could this work with the new snapshot architecture? I couldn't figure out exactly where this should live in Corsa. It used to live in ProjectService in Strada, and we don't have that anymore.

Member:

It would live in the Session, as that's kind of the ProjectService now.

Member:

Currently, the file cache just gets passed from Snapshot to Snapshot during cloning. At the beginning of each clone, the list of files invalidated by a watch change gets refreshed in the new iteration of the cache, and at the end of clones where the set of files for any project was reevaluated, we drop files that are no longer referenced (i.e., seen by a ReadFile) by any project. If we want to persist map files, it would be best to have them work roughly the same way:

  • they should be read in during snapshot building
  • they should be passed to the next snapshot during cloning
  • they need to be watched so changed files can be refreshed during that clone
  • we need a way of determining when they're no longer needed so they can be dropped from the file cache

Practically, I think this means making these changes:

  • undo the readFiles change in snapshotfs
  • instead, the caller of LS functionality that might need source mapping needs to check the result, see if mapping is needed, and go back to the session and ask for a new snapshot with the requested map files present, and then do the mapping on top of that snapshot; pseudocode:
    ls, err := session.GetLanguageService(goToDefRequestURI)
    result := ls.ProvideDefinitions(goToDefRequestURI, /*...*/)
    if dtsFileNamesToMap := getDtsFileNamesToMap(result); len(dtsFileNamesToMap) > 0 {
      mapper := session.GetSourceMapper(dtsFileNamesToMap)
      // Session will check if the necessary map files are already
      // in the current snapshot, and return a source mapper backed
      // by that snapshot if so. Otherwise, it will clone the snapshot
      // in order to include those files, and return a source mapper
      // backed by the new snapshot.
      return mapMappablePositions(result, mapper)
    }
    return result
    (Technically, this should capture the snapshot backing ls and use/clone that one to back the mapper, not whichever one happens to be current in the session at the end of ProvideDefinitions.)
  • During the snapshot clone that reads in new map files, the *diskFile that represents its input source file should also get a field written with the map file’s path (like ScriptInfo in Strada used to have). When the input file gets cleared from the snapshotfs file cache, its map file should get cleared with it.
  • Watches may need to be updated when adding/removing maps (wait for #1733, "Simplify and consolidate LSP watcher registrations")

That’s a pretty big lift, so I would not block this PR on it, personally.

Member:

I am happy with a simpler solution for now, even if it's slower. So if this PR looks good, then that's good in my eyes.

@gabritto (Member Author) commented Oct 3, 2025:

I'm gonna merge this so we have something working for users while I try to get the reusable cache version working, because I don't know how long it's gonna take.

@gabritto added this pull request to the merge queue Oct 3, 2025
Merged via the queue into main with commit d5215bc Oct 3, 2025
22 checks passed
@gabritto deleted the gabritto/def1 branch October 3, 2025 15:07