@CPunisher CPunisher commented Oct 19, 2025

Background:

In #10377 and #10399, we created another crate, swc_ecma_lexer, from swc_ecma_parser.

  1. The main goal was to split TokenKind and TokenValue to make the lexer and parser run faster. This has been done in swc_ecma_parser, whose Token is only 1 byte. It also means we had to refactor the lexer and the parser.
  2. SWC has always made Lexer and Token public, so changing Token would introduce a large breaking change for Rust users. For compatibility, we have to keep the legacy lexer and parser to produce compatible Tokens.
  3. However, I made a wrong decision in refactor(es/parser): Split parser into also-lex/parse-only #10399. We factored the common lexer/parser functions out into swc_ecma_lexer/common as much as possible by introducing a set of complex traits (ParseTrait, LexerTrait, etc.), which makes the project chaotic and harder to understand. You can see that swc_ecma_parser depends on swc_ecma_lexer and calls the common functions everywhere.
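
To illustrate point 1, here is a minimal sketch of the kind/value split. All names below (TokenKind variants, TokenValue, FatToken, lex_one, token_sizes) are hypothetical and are not the actual swc_ecma_parser definitions; the sketch only shows why a fieldless token can fit in 1 byte while a payload-carrying token cannot.

```rust
use std::mem::size_of;

/// Hypothetical token kind: a fieldless enum with an explicit u8
/// representation, so it occupies exactly 1 byte.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TokenKind {
    Ident,
    Num,
    Plus,
    Eof,
}

/// Hypothetical payload, stored next to the kind instead of inside it.
#[derive(Clone, Debug, PartialEq)]
pub enum TokenValue {
    Word(String),
    Num(f64),
}

/// A legacy-style token that carries its payload inline (for comparison only).
#[derive(Clone, Debug, PartialEq)]
pub enum FatToken {
    Ident(String),
    Num(f64),
    Plus,
    Eof,
}

/// Returns (size of the split token kind, size of the inline-payload token).
pub fn token_sizes() -> (usize, usize) {
    (size_of::<TokenKind>(), size_of::<FatToken>())
}

/// Toy lexer step: classify one word and return the kind and payload separately.
pub fn lex_one(word: &str) -> (TokenKind, Option<TokenValue>) {
    if word.is_empty() {
        (TokenKind::Eof, None)
    } else if word == "+" {
        (TokenKind::Plus, None)
    } else if let Ok(n) = word.parse::<f64>() {
        (TokenKind::Num, Some(TokenValue::Num(n)))
    } else {
        (TokenKind::Ident, Some(TokenValue::Word(word.to_string())))
    }
}
```

In this sketch the parser can compare and copy TokenKind cheaply and only touch TokenValue when it actually needs the payload.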

Motivation:

  1. It clearly improves DX. You don't need to jump to empty trait functions before jumping to their implementations, and you don't need to jump back and forth across two crates.
  2. It also leaves more room for performance optimization, because you are no longer restricted by the traits or by the legacy lexer and parser.

Description:

Now it's time to correct that decision. This PR makes swc_ecma_parser self-contained, so it no longer depends on swc_ecma_lexer. Instead, swc_ecma_lexer now depends on swc_ecma_parser for some simple common data structures such as Syntax. For compatibility, I also moved the legacy Token into swc_ecma_parser and import it from there.

After this PR, swc_ecma_lexer is effectively no longer maintained. All bug fixes and performance optimizations should be applied only in swc_ecma_parser.

Specifically, all this PR does is copy the common functions from swc_ecma_lexer/common into swc_ecma_parser and eliminate the trait-based generics. For example:

// Before
// crates/swc_ecma_lexer/common/...
pub trait Parser {
    fn xxx();
    fn yyy() { ... }
}

pub fn parse_xx<P: Parser>(p: &mut P) { ... }

// Impl for both legacy Parser and new performant Parser
impl common::Parser for Parser { ... }

// After
// crates/swc_ecma_parser/...
impl Parser {
    pub fn parse_xx(&mut self) {
        // Copy and paste the code
    }
}

Note that I changed almost nothing in swc_ecma_lexer, so the lexer and parser in that crate are still based on the traits and common functions.
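
For readers who want something compilable, the same before/after shape can be sketched on a toy digit parser. Everything here (CharSource, LegacyParser, Parser, parse_digits_generic, parse_digits) is a hypothetical stand-in, not code from either crate; it only illustrates moving a shared generic function into an inherent method.

```rust
// Before (sketch): shared logic is generic over a trait, so reading it means
// jumping from the generic function to the trait and then to each impl.
pub trait CharSource {
    fn peek(&self) -> Option<char>;
    fn bump(&mut self);
}

/// Shared routine: consumes leading ASCII digits and returns their value.
pub fn parse_digits_generic<P: CharSource>(p: &mut P) -> u32 {
    let mut n = 0;
    while let Some(d) = p.peek().and_then(|c| c.to_digit(10)) {
        n = n * 10 + d;
        p.bump();
    }
    n
}

pub struct LegacyParser {
    input: Vec<char>,
    pos: usize,
}

impl LegacyParser {
    pub fn new(src: &str) -> Self {
        Self { input: src.chars().collect(), pos: 0 }
    }
}

impl CharSource for LegacyParser {
    fn peek(&self) -> Option<char> {
        self.input.get(self.pos).copied()
    }
    fn bump(&mut self) {
        self.pos += 1;
    }
}

// After (sketch): the same logic copied into an inherent method, no trait involved.
pub struct Parser {
    input: Vec<char>,
    pos: usize,
}

impl Parser {
    pub fn new(src: &str) -> Self {
        Self { input: src.chars().collect(), pos: 0 }
    }

    fn peek(&self) -> Option<char> {
        self.input.get(self.pos).copied()
    }

    fn bump(&mut self) {
        self.pos += 1;
    }

    /// Same behavior as parse_digits_generic, specialized to this parser.
    pub fn parse_digits(&mut self) -> u32 {
        let mut n = 0;
        while let Some(d) = self.peek().and_then(|c| c.to_digit(10)) {
            n = n * 10 + d;
            self.bump();
        }
        n
    }
}
```

Both versions produce the same result on the same input; the difference is that the inherent method can later be optimized against the concrete Parser without affecting the legacy implementation.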

Breaking Changes:

If you don't use the Token API, there are no breaking changes, which is the case for most Rust API users. Otherwise, you may need to remove the dependency on swc_ecma_lexer and the related trait imports.

Test in community crates:

Future Works:

When I finished copying all the code, performance actually regressed. It took me a lot of time to figure out why, but I finally got the regression down to around -1%. To get there I had to do some complex optimizations ahead of time, such as refactoring parse_subscripts. I think it's better to merge those optimizations in separate PRs, so after this PR is ready I will split them off.


changeset-bot bot commented Oct 19, 2025

⚠️ No Changeset found

Latest commit: 20f5d1d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

@CPunisher CPunisher force-pushed the 10-11-refactor/parser branch from 8aef46c to d856923 Compare October 19, 2025 03:27

codspeed-hq bot commented Oct 19, 2025

CodSpeed Performance Report

Merging #11148 will degrade performances by 5.32%

Comparing CPunisher:10-11-refactor/parser (20f5d1d) with main (146c77c)

Summary

❌ 1 regression
✅ 110 untouched
🆕 11 new
⏩ 29 skipped [1]

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark | BASE | HEAD | Change
🆕 es/lexer/angular | N/A | 7.7 ms | N/A
🆕 es/lexer/backbone | N/A | 1 ms | N/A
🆕 es/lexer/cal-com | N/A | 13.5 ms | N/A
🆕 es/lexer/colors | N/A | 28.7 µs | N/A
🆕 es/lexer/jquery | N/A | 5.6 ms | N/A
🆕 es/lexer/jquery mobile | N/A | 8.7 ms | N/A
🆕 es/lexer/mootools | N/A | 4.4 ms | N/A
🆕 es/lexer/three | N/A | 20.2 ms | N/A
🆕 es/lexer/typescript | N/A | 110.4 ms | N/A
🆕 es/lexer/underscore | N/A | 883.8 µs | N/A
🆕 es/lexer/yui | N/A | 4.7 ms | N/A
es2020_nullish_coalescing | 298.8 µs | 315.6 µs | -5.32%

Footnotes

  1. 29 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@socket-security

This comment was marked as spam.

@CPunisher CPunisher force-pushed the 10-11-refactor/parser branch from 809252f to 0420eb6 Compare October 19, 2025 15:05
@CPunisher CPunisher force-pushed the 10-11-refactor/parser branch from 784b3e9 to 9e34964 Compare October 21, 2025 03:49
@CPunisher CPunisher force-pushed the 10-11-refactor/parser branch from cd20ed2 to d3b79fc Compare October 21, 2025 10:22
@CPunisher CPunisher force-pushed the 10-11-refactor/parser branch from d3b79fc to 236f45e Compare October 21, 2025 10:45
@socket-security

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License
Added | @babel/traverse@7.22.1 | 100 | 25 | 79 | 95 | 100
Added | @types/terser@3.12.0 | 66 | 100 | 39 | 77 | 100
Added | has@1.0.3 | 67 | 100 | 62 | 52 | 100
Added | @swc/counter@0.1.3 | 100 | 100 | 42 | 77 | 100
Added | node-releases@2.0.13 | 100 | 100 | 44 | 90 | 100
Added | hasown@2.0.2 | 67 | 100 | 72 | 53 | 100
Added | path-parse@1.0.7 | 67 | 100 | 79 | 52 | 100
Added | is-core-module@2.12.1 | 67 | 100 | 82 | 53 | 100
Added | function-bind@1.1.2 | 67 | 100 | 83 | 53 | 100
Added | @swc/plugin-jest@1.5.117 | 80 | 100 | 55 | 94 | 100
Added | @types/parse-json@4.0.0 | 100 | 100 | 58 | 78 | 100
Added | is-arrayish@0.2.1 | 100 | 100 | 58 | 81 | 100
Added | axios@0.21.4 | 99 | 60 | 100 | 95 | 100
Added | @babel/plugin-transform-react-jsx-development@7.18.6 | 100 | 100 | 61 | 89 | 100
Updated | jest-regex-util@25.2.6 ⏵ 29.6.3 | 100 | 100 | 62 | 86 | 100
Updated | jest-get-type@25.2.6 ⏵ 29.6.3 | 100 | 100 | 63 (+1) | 84 | 100
Updated | escape-string-regexp@2.0.0 ⏵ 1.0.5 | 100 | 100 | 63 (-5) | 77 | 100
Updated | @jest/globals@25.5.2 ⏵ 29.7.0 | 100 | 100 | 63 (+1) | 95 | 100
Added | @babel/plugin-transform-dotall-regex@7.18.6 | 100 | 100 | 63 | 89 | 100
Added | @babel/plugin-transform-unicode-regex@7.18.6 | 100 | 100 | 63 | 89 | 100
Updated | @jest/source-map@25.5.0 ⏵ 29.6.3 | 100 | 100 | 64 | 88 | 100
Added | @babel/plugin-transform-unicode-sets-regex@7.22.3 | 100 | 100 | 65 | 89 | 100
Added | @babel/plugin-transform-exponentiation-operator@7.18.6 | 100 | 100 | 65 | 89 | 100
Added | @babel/plugin-transform-reserved-words@7.18.6 | 100 | 100 | 65 | 89 | 100
Added | @babel/plugin-transform-class-properties@7.22.3 | 100 | 100 | 65 | 89 | 100
Added | @babel/plugin-transform-private-methods@7.22.3 | 100 | 100 | 65 | 89 | 100
Added | @babel/plugin-transform-sticky-regex@7.18.6 | 100 | 100 | 65 | 89 | 100
Updated | core-util-is@1.0.2 ⏵ 1.0.3 | 100 | 100 | 65 (-1) | 77 | 100
Added | collect-v8-coverage@1.0.1 | 100 | 100 | 66 | 85 | 100
Added | @babel/plugin-transform-named-capturing-groups-regex@7.22.3 | 100 | 100 | 66 | 89 | 100
Updated | jest-resolve-dependencies@25.5.4 ⏵ 29.7.0 | 100 (+1) | 100 | 66 (+1) | 96 | 100
Added | @babel/plugin-transform-property-literals@7.18.6 | 100 | 100 | 66 | 89 | 100
Added | @babel/plugin-transform-optional-catch-binding@7.22.3 | 100 | 100 | 66 | 89 | 100
See 381 more rows in the dashboard