Skip to content

Conversation

@m4rch3n1ng
Copy link
Contributor

@m4rch3n1ng m4rch3n1ng commented Mar 12, 2025

support the textDocument/rename and textDocument/prepareRename lsp requests. currently only works for LetVar and Local identifiers, as i was unsure of how to properly handle Global identifiers with regards to (require ...) and those two were the easiest to implement by far.

on an unrelated side note: i noticed that (map ...) is not marked as a builtin, which i think it should be?. not sure how to fix that though.

@mattwparas
Copy link
Owner

map is defined in stdlib.scm - so I don't think it will get immediately registered as a builtin from the semantic analysis perspective. Things in that file probably should be specialized though, so that we can do some more work with it easily

@m4rch3n1ng m4rch3n1ng force-pushed the lsp-rename branch 2 times, most recently from 0706c1a to 70f7f31 Compare March 16, 2025 22:40
@m4rch3n1ng
Copy link
Contributor Author

m4rch3n1ng commented Mar 17, 2025

this does not quite work, as without lowering (which is not done for the ast_map), the following code produces 4 different syntax object ids for the same

(let countdown ((i 10))
  (if (= i 0) 'liftoff
      (begin
        (display i)
        (newline)
        (countdown (- i 1)))))

doing engine.emit_expanded_ast_without_optimizations(script, None) and then feeding that into SemanticAnalysis::new() gives me this output:

full reproducing code

let script = r#"(let countdown ((i 10))
  (if (= i 0) 'liftoff
      (begin
        (display i)
        (newline)
        (countdown (- i 1)))))
"#;

let mut engine = Engine::new();
let mut exprs = engine
    .emit_expanded_ast_without_optimizations(script, None)
    .unwrap();
let analysis = SemanticAnalysis::new(&mut exprs);

let identifiers = analysis.analysis.identifier_info();
println!("{:#?}", identifiers);
{
    SyntaxObjectId(
        237240,
    ): SemanticInformation {
        kind: Local,
        set_bang: false,
        depth: 2,
        shadows: None,
        usage_count: 1,
        span: 5..14,
        refers_to: Some(
            SyntaxObjectId(
                237211,
            ),
        ),
        aliases_to: None,
        builtin: false,
        last_usage: true,
        stack_offset: Some(
            0,
        ),
        escapes: false,
        capture_index: None,
        read_capture_offset: None,
        captured_from_enclosing: false,
        heap_offset: None,
        read_heap_offset: None,
        is_shadowed: false,
        is_required_identifier: false,
    },
    // -- snip --
    SyntaxObjectId(
        237242,
    ): SemanticInformation {
        kind: LetVar,
        set_bang: false,
        depth: 2,
        shadows: None,
        usage_count: 3,
        span: 5..14,
        refers_to: None,
        aliases_to: None,
        builtin: false,
        last_usage: false,
        stack_offset: None,
        escapes: false,
        capture_index: None,
        read_capture_offset: None,
        captured_from_enclosing: false,
        heap_offset: None,
        read_heap_offset: None,
        is_shadowed: false,
        is_required_identifier: false,
    },
    // -- snip --
    SyntaxObjectId(
        237215,
    ): SemanticInformation {
        kind: Local,
        set_bang: false,
        depth: 2,
        shadows: None,
        usage_count: 1,
        span: 5..14,
        refers_to: Some(
            SyntaxObjectId(
                237242,
            ),
        ),
        aliases_to: None,
        builtin: false,
        last_usage: true,
        stack_offset: Some(
            1,
        ),
        escapes: false,
        capture_index: None,
        read_capture_offset: None,
        captured_from_enclosing: false,
        heap_offset: None,
        read_heap_offset: None,
        is_shadowed: false,
        is_required_identifier: false,
    },
    // -- snip --
    SyntaxObjectId(
        237211,
    ): SemanticInformation {
        kind: LetVar,
        set_bang: false,
        depth: 2,
        shadows: None,
        usage_count: 1,
        span: 5..14,
        refers_to: None,
        aliases_to: None,
        builtin: false,
        last_usage: false,
        stack_offset: None,
        escapes: false,
        capture_index: None,
        read_capture_offset: None,
        captured_from_enclosing: false,
        heap_offset: None,
        read_heap_offset: None,
        is_shadowed: false,
        is_required_identifier: false,
    },
}

as you can see there are multiple different identifiers with multiple different syntax objects with different kinds for the span 5..14: they are is 237240, which is a LetVar, 237242, a Local, 237215, a Local and 237211, another LetVar, and two of them refer_to a different of the other ones. if you look at the full output (which i have ommitted here because it is quite large) you can see that the one that is probably the "correct" one is 237242, because there are a few other items that refer_to that one. for all the other nothing outside of these 4 refer to them.

if i replace the engine.emit_expanded_ast_without_optimizations with a simple Parser::parse(), the resulting identifiers look a lot more like what i want:

{
    SyntaxObjectId(
        691,
    ): SemanticInformation {
        kind: Local,
        set_bang: false,
        depth: 2,
        shadows: None,
        usage_count: 1,
        span: 5..14,
        refers_to: Some(
            SyntaxObjectId(
                747,
            ),
        ),
        aliases_to: None,
        builtin: false,
        last_usage: true,
        stack_offset: None,
        escapes: false,
        capture_index: None,
        read_capture_offset: None,
        captured_from_enclosing: false,
        heap_offset: None,
        read_heap_offset: None,
        is_shadowed: false,
        is_required_identifier: false,
    },
    SyntaxObjectId(
        747,
    ): SemanticInformation {
        kind: LocallyDefinedFunction,
        set_bang: false,
        depth: 2,
        shadows: None,
        usage_count: 2,
        span: 5..14,
        refers_to: None,
        aliases_to: None,
        builtin: false,
        last_usage: false,
        stack_offset: None,
        escapes: false,
        capture_index: None,
        read_capture_offset: None,
        captured_from_enclosing: false,
        heap_offset: None,
        read_heap_offset: None,
        is_shadowed: false,
        is_required_identifier: false,
    },
    // -- snip --
    SyntaxObjectId(
        737,
    ): SemanticInformation {
        kind: LocallyDefinedFunction,
        set_bang: false,
        depth: 3,
        shadows: None,
        usage_count: 1,
        span: 107..116,
        refers_to: Some(
            SyntaxObjectId(
                747,
            ),
        ),
        aliases_to: None,
        builtin: false,
        last_usage: false,
        stack_offset: None,
        escapes: false,
        capture_index: None,
        read_capture_offset: None,
        captured_from_enclosing: false,
        heap_offset: None,
        read_heap_offset: None,
        is_shadowed: false,
        is_required_identifier: false,
    },
    // -- snip --

here i can just resolve the identifier to the LocallyDefinedFunction, and then filter for all identifiers that resolve to the same identifier and are also a LocallyDefinedFunction and it works as expected.

i pushed a temporary proof-of-concept commit as d6cb6ea that does just exactly that, but that is less than ideal because i have to recompute the map on every call of rename and i have to (probably) reallocate the slice, because it is stored as a rope. with that commit applied, everything (at least in my kind of limited testing) works for me as expected.

as i do not fully understand the code base (sorry if i did this pr a little rash), is there a reason the ast_map is parsed without lowering?

@mattwparas
Copy link
Owner

Sorry - been a busy week! To answer your immediate question, the reason it is parsed without lowering is to handle macros, generally speaking. Otherwise - I don't remember the specific reason that the LSP parses without lowering here. I will take a more involved look and try to provide some more clarify

@mattwparas
Copy link
Owner

mattwparas commented Apr 19, 2025

Okay - I've come back to this, sorry for the delay:

The emit_expanded_ast does do all of the proper lowering. Taking your example provided, here is a snippet of the AST:

[Let(Let { bindings: [(Atom(Atom { syn: RawSyntaxObject { ty: Identifier(3907), span: 5..14 } }), Atom(Atom { syn: RawSyntaxObject { ty: Number(Real(Int(Small(123)))), span: 0..0 } }))], body_expr: Let(Let 
{ bindings: [(Atom(Atom { syn: RawSyntaxObject { ty: Identifier(3909), span: 5..14 } }), List(List { args: [Atom(Atom { syn: 
RawSyntaxObject { ty: Identifier(492), span: 0..0 } }), Atom(Atom { syn: RawSyntaxObject { ty: Identifier(3907), span: 5..14 } })], syntax_object_id: 245242, improper: false, location: 0..0 }))], body_expr: Let(Let 
{ bindings: [(Atom(Atom { syn: RawSyntaxObject { ty: Identifier(3908), span: 0..0 } }), LambdaFunction(LambdaFunction {
...

let gets expanded, which ends up with a few different things sharing the span for it since they were created during the expansion process. I'm investigating the analysis output to see if I can make some sense of it.

@mattwparas
Copy link
Owner

So this is why the analysis looks funny. This is the fully expanded code:

(%plain-let ((##countdown3 123)) ;; dummy value that should probably be something like 'unintialized
  (%plain-let ((####countdown33 (#%box ##countdown3)))
    (%plain-let ((##_____countdown04 (λ (##i4)
          (if (= ##i4 0)
            (quote
              liftoff)
            (begin
             (display ##i4)
                  (newline)
                  ((#%unbox ####countdown33)
                     (- ##i4 1)))))))
      (begin
       (#%set-box! ####countdown33 ##_____countdown04)
            ((#%unbox ####countdown33) 10)))))

@mattwparas
Copy link
Owner

mattwparas commented Apr 19, 2025

So, after looking at this, I actually think your solution makes the most sense. Given that renaming should really only apply "source code level" stuff, re-parsing and attempting to resolve stuff pre macro expansion makes a lot of sense. Given that, I'm happy to approve and live on it for a bit to see how it plays.

Were you happy with this as is, or did you want to do any more on it?

mattwparas
mattwparas previously approved these changes Apr 19, 2025
@m4rch3n1ng
Copy link
Contributor Author

i was not quite happy with the commit marked with [tmp] (5429f8e) as that cloned the string out of the rope and re-evaluated the ast every time a rename was made or prepared and i wanted to that in the same place where you were already emit_expanded_ast to avoid cloning and duplicating this code.

i will have to get back to this later though, as i am very tired and need to wake up early tomorrow (though i'll probably deal with the morge conflict in #338 before that)

@mattwparas
Copy link
Owner

No rush at all 😃

I'll hold off on merging then

@mattwparas mattwparas dismissed their stale review April 19, 2025 22:25

Premature approval

@m4rch3n1ng
Copy link
Contributor Author

did that now: i added a new lowered_ast dashmap, that get's initialized. i didn't do any error reporting, as i think that should have been done when getting the ast for the normal ast_map. i haven't done the most extensive testing for this version, but i don't think i broke it at least.

@mattwparas
Copy link
Owner

Thanks! Lets get this in so it can start getting some use

@mattwparas mattwparas merged commit 94b7888 into mattwparas:master Apr 23, 2025
9 checks passed
@m4rch3n1ng m4rch3n1ng deleted the lsp-rename branch April 23, 2025 07:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants