-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhanced AST #44
Comments
I also want "enhanced AST". Although my desire are pretty different from yours. I want it to be immutable, so that nodes can be compared / hashed. But this means I want to destroy line information etc. on the nodes themselves. This would also allow nodes to be reused without deep copying them . Source maps could be maintained for nodes based on identity but this would run into issue if nodes are used in multiple places without deep copying. I want a version independent AST, so passes are more portable. My vision for this would basically be what ever the bleeding edge AST is, with older versions only allowing a subset of nodes. Perhaps the most notable recent change that I want to back port is unified constant nodes, instead having different nodes for every type of literal. Finally I want to a distinct node for Also persevering and parsing comments would be nice but is lower priority for me than 3 concerns above. |
Was just chatting with @willcrichton want to move this a discussion here and get @leonardt opinion. Debian has moved to python 3.8 which breaks ast_tools so I have more motivation to either make my own alt ast or migrate to an existing one. Will uses LibCST and plans on continuing to use it. However, it has the downside of not supporting returning Right now am I leaning toward moving to @leonardt are there features in AST that you don't have in the standard module which you would like for silica and fault? We should do a review of our requirements before a major rewrite. |
Right now magma is using the pass infrastructure (begin/end rewrite). It would be nice if we could do some brainstorming to see if we could come up with a slightly more ergonomic interface (at some point I think we liked the idea of Other than that there's no real immediate feature requirements from me, although I'm sure any useful/reusable code would help clean up the internals. This PR has a good overview of the new comb/seq flow for the AST (a lot simpler now) https://github.com/phanrahan/magma/pull/668/files |
Okay I think I am going to start working on this. I think for now I am going to create |
Actually it turns out that making an For now there will be separate |
I think we can close this now that we've moved to libcst? |
agreed |
For my inlining tool, there's a number of features I need that aren't present in the standard AST module. I'm thinking about the design of an "enhanced" AST that enables these features, and I'd like to know what use cases y'all have as well.
Requirements
Preserve comments and whitespace: when e.g. inlining a function, because I'm generating human-readable code, it's useful to preserve comments and whitespace for low-level legibility and higher-level comprehension.
ast.parse
drops this information after tokenization.Generate string from AST, and preserve line number mapping to AST nodes: to test whether a line of code is executed in a program, I turn the AST into a string. Then I execute the program using
Frame.f_trace_lines
to track executed lines. Then, I need to map this information back to an AST node. The only way to do that currently is to re-generate the AST from the created source file. However, this throws away any in-memory information attached to the AST, unless it somehow is explicitly dumped into the generated Python source. Ideally, the AST would not be re-generated, and instead the AST-to-string routine can remember the AST node <-> line number mapping.Annotate AST nodes. It would be useful to annotate AST nodes with information like a source map, or a history of transformations applied to it. This is possible right now by just attaching members to the AST node object, although that usually messes with something, e.g. structural AST comparisons.
Prior work
There are two main projects that have a subset of these features:
baron
: this is a seemingly popular tool designed for source-preserving refactoring. It completely circumvents the CPython AST facilities, having its own AST, grammar, visitor framework, so on. It was specifically designed such thatto_string(parse(code)) == code
, i.e. 1-to-1 source mapping. However, we probably wouldn't want to tie ourselves to an AST framework that won't interoperate with all other AST tools.horast
: this is a small tool that just preserves comments in the AST (not whitespace). It's a lightweight extension on-top of CPython's ast library. It seems like a good reference implementation for comment-preservation even if we don't literally use the library.The text was updated successfully, but these errors were encountered: