[v2] add version validation for structured CST by OmarTawfik · Pull Request #1627 · NomicFoundation/slang

OmarTawfik · 2026-04-07T14:19:03Z

No description provided.

changeset-bot · 2026-04-07T14:19:10Z

⚠️ No Changeset found

Latest commit: 840c163

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


teofr

Just one small thing, and two questions

crates/codegen-v2/cst/src/structured_cst/versioned_descendants.rs

crates/solidity-v2/outputs/cargo/cst/src/structured_cst/text_range/mod.rs

...olidity-v2/outputs/cargo/cst/src/structured_cst/validation/validate_syntax_version.rs.jinja2

ggiraldez · 2026-04-08T20:28:31Z

crates/solidity-v2/outputs/cargo/tests/src/cst/cst_output/runner.rs

+        // TODO(v2): these tests should really go through 'CompilationUnit' once it is ready.
+        // This way, we won't have to call individual validation APIs.
+        // All errors should be collected during the compilation unit construction.


As of right now, the CompilationUnit will also perform semantic analysis on the parsed source, which is something we absolutely don't need for these snapshots. It will also do other things unrelated to parsing, e.g. resolution of imported paths. I think we may need an intermediate abstraction that both these tests and the CompilationUnit can use.

> I think we may need an intermediate abstraction that both these tests and the CompilationUnit can use.

And the benchmarks

Agreed. Once your PRs are merged, I was thinking of adding two levels of operations (at least for now):

Syntax: parsing the files and doing any syntax-only validation (per file), such as parse errors, versioning, pragmas, and import resolution. None of this needs to build the AST or run the nano-passes yet.

Semantic: running the nano-passes and collecting compilation-wide diagnostics.
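To make the two levels concrete, here is a minimal sketch. Every name in it (`Diagnostic`, `SyntaxOutput`, `check_syntax`, `check_semantics`) is hypothetical and not part of slang, and the checks themselves are placeholders standing in for real parsing and nano-passes.

```rust
use std::collections::HashSet;

/// Hypothetical diagnostic type; the real one is still to be designed.
#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub file: String,
    pub message: String,
}

/// Result of the per-file, syntax-only level.
#[derive(Debug)]
pub struct SyntaxOutput {
    pub file: String,
    pub diagnostics: Vec<Diagnostic>,
}

/// Level 1 (syntax): per-file checks that need neither the AST nor other files.
/// Placeholder check: only looks for a version pragma line.
pub fn check_syntax(file: &str, source: &str) -> SyntaxOutput {
    let mut diagnostics = Vec::new();
    let has_pragma = source
        .lines()
        .any(|line| line.trim_start().starts_with("pragma solidity"));
    if !has_pragma {
        diagnostics.push(Diagnostic {
            file: file.to_string(),
            message: "missing version pragma".to_string(),
        });
    }
    SyntaxOutput { file: file.to_string(), diagnostics }
}

/// Level 2 (semantic): compilation-wide checks over all level-1 outputs.
/// Placeholder check: reports a file that appears more than once.
pub fn check_semantics(files: &[SyntaxOutput]) -> Vec<Diagnostic> {
    // Start from everything the syntax level already reported.
    let mut all: Vec<Diagnostic> = files
        .iter()
        .flat_map(|f| f.diagnostics.iter().cloned())
        .collect();
    let mut seen = HashSet::new();
    for f in files {
        if !seen.insert(f.file.as_str()) {
            all.push(Diagnostic {
                file: f.file.clone(),
                message: "file included more than once".to_string(),
            });
        }
    }
    all
}
```

With this split, snapshot tests and benchmarks could call only the first level, while a compilation unit would run both.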

I will also think about how to combine/report validation errors across the board using a standard type/set of utils to serialize/render them. Please let me know if you have any suggestions in the meantime.

ggiraldez

Left one comment but looks good to me.

My only other concern is the text_range() functions returning an Option<> which intuitively doesn't feel right. I understand the reasons why, but from a user perspective I'd assume it returns an empty range, but located at the offset where the node would be.

OmarTawfik · 2026-04-09T11:03:31Z

crates/solidity-v2/outputs/cargo/cst/src/structured_cst/text_range/mod.rs

@@ -0,0 +1,14 @@
+#[path = "text_start.generated.rs"]


> My only other concern is the text_range() functions returning an Option<> which intuitively doesn't feel right. I understand the reasons why, but from a user perspective I'd assume it returns an empty range, but located at the offset where the node would be.

AFAIU, we are not exposing structured_cst or its utils to the user at all, so this is only internal. The call-sites are never expected to call it on an empty node (they already .expect() it).

If these assumptions ever change, I think we would need to introduce a cursor/stateful visitor to keep track of ancestry/outer ranges as well. But given that most future validation/operations will happen on the AST, I'm not sure if it is worth adding one now.

@ggiraldez Thoughts?

You're right, we're not exposing the CST to the user. I was thinking about how we can use this information to translate it to the AST layer and then provide it to the user. But I agree it can be done in the IR builder by adding a bit of state.

In any case, the only nodes that can potentially be empty are the collection non-terminals, right? And I guess, transitively, any other non-terminal that contains a single collection, i.e. a choice or another collection. Am I missing any other case?

There's a cleanish solution to this I was considering, but I'm not fully convinced it doesn't bring up other issues.

We could get rid of allow_empty in collections, and instead put that responsibility in the parent by using Optional(...) instead of Required(...). Then every collection is non-empty by definition.

It's not necessary, but it came up while solving #1654

From the IR/AST point of view, it does make the data structures a tiny bit more cumbersome to work with (i.e. needing to unwrap the Option). But it may be something we can solve when building the IR.
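A small sketch of what that alternative could look like on the data-structure side. `NonEmpty` and `Block` are hypothetical types made up for illustration, not the generated structured_cst types: the collection cannot be empty, and the parent field becomes an `Option`.

```rust
/// A collection that is non-empty by construction (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct NonEmpty<T> {
    first: T,
    rest: Vec<T>,
}

impl<T> NonEmpty<T> {
    /// Returns `None` for an empty input, so emptiness is expressed
    /// by the parent's optional field rather than by the collection.
    pub fn from_vec(mut items: Vec<T>) -> Option<Self> {
        if items.is_empty() {
            return None;
        }
        let first = items.remove(0);
        Some(Self { first, rest: items })
    }

    /// Always at least 1.
    pub fn len(&self) -> usize {
        1 + self.rest.len()
    }

    /// Infallible, unlike `Vec::first`.
    pub fn first(&self) -> &T {
        &self.first
    }
}

/// Hypothetical parent node: the field is optional instead of the
/// collection allowing empty.
#[derive(Debug)]
pub struct Block {
    pub statements: Option<NonEmpty<String>>,
}

impl Block {
    /// The extra unwrapping step: consumers handle `None` where they
    /// previously handled an empty list.
    pub fn statement_count(&self) -> usize {
        self.statements.as_ref().map_or(0, |s| s.len())
    }
}
```

The trade-off shows up in `statement_count`: every consumer has to go through the `Option` first, which is the extra friction noted above.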

> In any case, the only nodes that can potentially be empty are the collection non-terminals, right?

Yes. This would be None in the case of empty collections. And in that case we would never have an offending syntax to try to get the range for.
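For illustration, a minimal sketch of that behaviour. `Terminal`, `Statements`, and `offending_range` are hypothetical stand-ins rather than the generated code: the range of a collection is derived from its first and last child, so an empty collection has nothing to derive it from.

```rust
use std::ops::Range;

/// Hypothetical terminal that knows its own byte range.
#[derive(Debug)]
pub struct Terminal {
    pub range: Range<usize>,
}

/// Hypothetical collection non-terminal.
#[derive(Debug)]
pub struct Statements {
    pub items: Vec<Terminal>,
}

impl Statements {
    /// Returns `None` for an empty collection: without children there is
    /// no offset to anchor a range to.
    pub fn text_range(&self) -> Option<Range<usize>> {
        let first = self.items.first()?;
        let last = self.items.last()?;
        Some(first.range.start..last.range.end)
    }
}

/// Call-site in the style described in the thread: it is only reached for a
/// node that contains offending syntax, so the node cannot be empty.
pub fn offending_range(node: &Statements) -> Range<usize> {
    node.text_range().expect("offending node is never empty")
}
```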

> We could get rid of allow_empty in collections, and instead put that responsibility in the parent by using Optional(...) instead of Required(...). Then every collection is non-empty by definition.

> it does make the data structures a tiny bit more cumbersome to work with

This is the reason it is enforced through Errors::OptionalFieldAllowsEmpty, as it was much easier to deal with in all subsequent APIs.

I think it should be trivial to get complete ranges with a bit of state and some extra processing, but it is not needed for the CST so far.
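One way the "bit of state" could look, as a rough sketch under simplifying assumptions: `Node` and `RangeTracker` are hypothetical, terminals carry only a length, and trivia is ignored. The walker carries the running offset, so an empty collection gets an empty range at the position where it would have appeared.

```rust
use std::ops::Range;

/// Hypothetical simplified CST: terminals carry only their length.
pub enum Node {
    Terminal { len: usize },
    Collection(Vec<Node>),
}

/// Walks the tree depth-first while tracking the current offset,
/// recording one range per node in pre-order.
pub struct RangeTracker {
    offset: usize,
    pub ranges: Vec<Range<usize>>,
}

impl RangeTracker {
    pub fn new() -> Self {
        Self { offset: 0, ranges: Vec::new() }
    }

    pub fn visit(&mut self, node: &Node) -> Range<usize> {
        let start = self.offset;
        // Reserve this node's slot before visiting children to keep pre-order.
        let index = self.ranges.len();
        self.ranges.push(start..start);
        match node {
            Node::Terminal { len } => self.offset += len,
            Node::Collection(children) => {
                for child in children {
                    self.visit(child);
                }
            }
        }
        // For an empty collection the offset did not move: the range is start..start.
        let range = start..self.offset;
        self.ranges[index] = range.clone();
        range
    }
}
```

In this sketch, a tree with a 3-byte terminal, an empty collection, and a 2-byte terminal yields `3..3` for the empty collection rather than `None`.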

teofr

👍

OmarTawfik force-pushed the OmarTawfik/validate-syntax-version branch from 4b69995 to 45a5e88 Compare April 7, 2026 14:29

OmarTawfik marked this pull request as ready for review April 7, 2026 14:47

OmarTawfik requested review from a team as code owners April 7, 2026 14:47

teofr reviewed Apr 8, 2026

ggiraldez reviewed Apr 8, 2026

ggiraldez approved these changes Apr 8, 2026

[v2] add version validation for structured CST

620a1b2

OmarTawfik commented Apr 9, 2026

OmarTawfik mentioned this pull request Apr 9, 2026

[v2] make sure structured_cst includes all relevant terminals #1654

Open

review comments

840c163

OmarTawfik force-pushed the OmarTawfik/validate-syntax-version branch from 45a5e88 to 840c163 Compare April 9, 2026 11:26

OmarTawfik enabled auto-merge April 9, 2026 11:26

OmarTawfik disabled auto-merge April 9, 2026 11:27

OmarTawfik requested a review from teofr April 9, 2026 11:27

teofr approved these changes Apr 9, 2026

OmarTawfik added this pull request to the merge queue Apr 9, 2026

Merged via the queue into main with commit 91f575d Apr 9, 2026
16 of 18 checks passed

OmarTawfik deleted the OmarTawfik/validate-syntax-version branch April 9, 2026 12:24
