Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect lexing and parsing of multi-byte comments #708

Closed
wants to merge 2 commits into from

Conversation

saranrapjs
Copy link

We recently ran into a bug where including the unicode em-dash character () lead to GraphQL queries with that character in the comments to fail to be parsed. Digging into the issue further, it looks like some of the existing handling for lexing comment strings as whitespace fail to account for multi-byte comments in how they count byte length of commented unicode characters.

This PR fixes the bug (by including the byte length of commented characters in the "runePosition" count that's used to fast forward whitespace), and adds a couple of tests in the lexer and parser packages which should illustrate the bug as it stands on main.

@coveralls
Copy link

Coverage Status

coverage: 92.048%. remained the same
when pulling bd7da58 on nytimes:fix-multi-byte-comments
into a546af7 on graphql-go:master.

@saranrapjs
Copy link
Author

I'm realizing this addresses the same bug as this PR: #605

@saranrapjs
Copy link
Author

...I'm going to close this in favor of the other PR, which also addresses multi-byte character sequences in non-comment ranges (e.g. a BOM character) in addition to the comments bug. It also passes the tests I'd added here.

@saranrapjs saranrapjs closed this Jan 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants