-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Auto-generate AST boilerplate #15544
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thanks for doing this!
Unrelated to this PR, Biome does something similar (https://github.com/biomejs/biome/tree/main/xtask/codegen).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this is great.
I think my long term preference would be to use something like ungrammar that would also allow us to e.g. generate the source order visitor but I think this is a great start.
crates/ruff_python_ast/generate.py
Outdated
root = check_output(["git", "rev-parse", "--show-toplevel"], text=True).strip() | ||
ast_path = Path(root).joinpath("crates", "ruff_python_ast", "ast.toml") | ||
out_path = Path(root).joinpath("crates", "ruff_python_ast", "src", "generated.rs") | ||
with ast_path.open("rb") as ast_file: | ||
ast = tomllib.load(ast_file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the code a bit hard to read because it mixes top level statement with class and declarations. I'd prefer to instead have a main
function that's called in if __name__ == "__main__"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
crates/ruff_python_ast/generate.py
Outdated
out = [""" | ||
// This is a generated file. Don't modify it by hand! | ||
// Run `crates/ruff_python_ast/generate.py` to re-generate the file. | ||
"""] # fmt: skip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd use one of:
out = [
textwrap.dedent("""
// This is a generated file. Don't modify it by hand!
// Run `crates/ruff_python_ast/generate.py` to re-generate the file.
""")
]
out = [
textwrap.dedent(
"""
// This is a generated file. Don't modify it by hand!
// Run `crates/ruff_python_ast/generate.py` to re-generate the file.
"""
)
]
Or use the auto formatted code. IMO, it's roughly equally esthetic as the manually formatted array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is moot as part of reorganizing into functions.
crates/ruff_python_ast/generate.py
Outdated
# ------------------------------------------------------------------------------ | ||
# Owned enum | ||
# | ||
# For each group, we create an enum that contains an owned copy of a syntax | ||
# node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think each of those blocks could be functions instead where the inline comment becomes a docstring.
It would also be nice if the docstring could contain. code snippet of what it generates:
Generates the owned enum for each group, e.g Stmt
, Expr
etc. For each, generate:
- The enum
- Implement
From<Variant>
- Implement
Ranged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
/// See also [stmt](https://docs.python.org/3/library/ast.html#ast.stmt) | ||
#[derive(Clone, Debug, PartialEq, is_macro::Is)] | ||
pub enum Stmt { | ||
#[is(name = "function_def_stmt")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR but I think it would be nice if we generated the is
implementations as well. The is
macro doesn't always work well with ide's and macros are also more expensive to compile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some Python nitpicks ;)
Co-authored-by: Micha Reiser <[email protected]> Co-authored-by: Alex Waygood <[email protected]>
.pre-commit-config.yaml
Outdated
--extend-select, | ||
I, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to put this config in the tool.ruff
section of our pyproject.toml
file at the directory root?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it would! Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth considering moving the ruff config in the scripts dir into the root dir pyproject.toml
. Isort and some other common rules are already enabled there, so it may be useful to enable them for the whole project if scripts are being added outwith the scripts dir.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I agree with @calumy's comment at #15544 (comment), but that could be done as a followup!
* main: [red-knot] Inline `SubclassOfType::as_instance_type_of_metaclass()` (#15556) [`flake8-comprehensions`] strip parentheses around generators in `unnecessary-generator-set` (`C401`) (#15553) [`pylint`] Implement `redefined-slots-in-subclass` (`W0244`) (#9640) [`flake8-bugbear`] Do not raise error if keyword argument is present and target-python version is less or equals than 3.9 (`B903`) (#15549) [red-knot] `type[T]` is disjoint from `type[S]` if the metaclass of `T` is disjoint from the metaclass of `S` (#15547) [red-knot] Pure instance variables declared in class body (#15515) Update snapshots of #15507 with new annotated snipetts rendering (#15546) [`pylint`] Do not report methods with only one `EM101`-compatible `raise` (`PLR6301`) (#15507) Fix unstable f-string formatting for expressions containing a trailing comma (#15545) Support `knot.toml` files in project discovery (#15505) Add support for configuring knot in `pyproject.toml` files (#15493) Fix bracket spacing for single-element tuples in f-string expressions (#15537) [`flake8-simplify`] Do not emit diagnostics for expressions inside string type annotations (`SIM222`, `SIM223`) (#15405) [`flake8-pytest-style`] Do not emit diagnostics for empty `for` loops (`PT012`, `PT031`) (#15542)
This PR replaces most of the hard-coded AST definitions with a generation script, similar to what happens in
rust_python_formatter
. I've replaced every "rote" definition that I could find, where the content is entirely boilerplate and only depends on what syntax nodes there are and which groups they belong to.This is a pretty massive diff, but it's entirely a refactoring. It should make absolutely no changes to the API or implementation. In particular, this required adding some configuration knobs that let us override default auto-generated names where they don't line up with types that we created previously by hand.
Test plan
There should be no changes outside of the
rust_python_ast
crate, which verifies that there were no API changes as a result of the auto-generation. Aggressivecargo clippy
anduvx pre-commit
runs after each commit in the branch.