All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- Introduced the new crate feature `regex_automata` to configure the `scnr` crate to use an alternative regex engine (see the sketch below)
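A minimal sketch of how this feature could be enabled in a dependent crate; the version number shown is only illustrative:

```toml
[dependencies]
# Illustrative entry: enabling the `regex_automata` feature configures the
# `scnr`-based scanner to use the alternative regex engine.
parol_runtime = { version = "2", features = ["regex_automata"] }
```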
- Fix for #558: this fix updates `last_consumed_token_end_pos` in `TokenStream::take_skip_tokens` on skip tokens, too
- Fixed a subtle bug related to token buffer handling after scanner mode switching near the end of the input
- Now using `scnr` version 0.7.0 for its increased performance
Please note that changes made in version 2 are also detailed in an extra chapter of the book.
- Integration of the scanner crate `scnr`: `parol_runtime` now uses this crate as its scanner instead of regexes created with the help of the `regex-automata` crate. I hope that this way we can better fulfill the specific needs in the context of tokenization. On the other hand, I'm aware that we will surely lose some comfort. I'm curious how things will work out here. Please give feedback on any problems you encounter.
- The changes coming with this switch to `scnr` lead to a lot of changes internally, in the public interface, and in the behavior of generated parsers from the perspective of tokenization. Thus the bump in the major version. Especially regarding these differences in behavior, please have a look at `scnr`'s README.
- `<UserType>GrammarTrait::on_comment_parsed` is renamed to `<UserType>GrammarTrait::on_comment` for clarity.
- Support for vanilla mode has been discontinued. The related feature `auto_generation` became pointless and has therefore been removed.
- Version 1 will be supported and updated regularly on branch `release1.0`, so you aren't forced to switch to version 2 any time soon.
Error recovery on generated LL(k) parsers can now be disabled.
- Fixed clippy warnings new in Rust 1.80.0
Version 1 is maintained on branch `release1.0`. All changes for this version are therefore only visible on that branch. This includes its change log, too.
- Fix issue #357
- LR parser: Outputting current scanner in error message
- Fixed a problem with the determinism and termination of the recovery process in generated LL parsers
- Add new errors related to recovery, thus minor version bump
- Fix default settings for enabling parse tree generation in `LRParser`
- Optimize memory consumption in case parse tree generation is disabled in `LRParser`
- Improved load performance of `LRParser`s by using a static array as data
- Provide parse tree generation for `LRParser`
- Implement scanner-based scanner switching which can be used with LL(k) and LALR(1) parsers
- The public API has changed and matches `parol` >= 0.29
- New parser type to foster LALR(1) grammar support of `parol` 0.28.0
- Fixed issue #310 (Access internal data of TokenVec): I extended the implementation of `TokenVec`. It now provides a `get` method and an `iter` method.
```rust
/// A vector of tokens in a string representation
#[derive(Debug, Default)]
pub struct TokenVec(Vec<String>);

impl TokenVec {
    /// Pushes a token to the vector
    pub fn push(&mut self, token: String) {
        self.0.push(token);
    }

    /// Returns an iterator over the tokens
    pub fn iter(&self) -> std::slice::Iter<String> {
        self.0.iter()
    }

    /// Returns a token at the given index
    pub fn get(&self, index: usize) -> Option<&String> {
        self.0.get(index)
    }
}
```
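A short usage sketch, assuming the `TokenVec` type shown above is in scope:

```rust
// Illustrative usage of the extended TokenVec API.
fn main() {
    let mut tokens = TokenVec::default();
    tokens.push("fn".to_string());
    tokens.push("main".to_string());

    // Random access via the new `get` method
    assert_eq!(tokens.get(1).map(String::as_str), Some("main"));

    // Sequential access via the new `iter` method
    for token in tokens.iter() {
        println!("{token}");
    }
}
```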
- Refactor `parol_runtime::parser::LLKParser::adjust_token_stream` that is used in error recovery
- Improved performance in scanner
- Imposes some BREAKING CHANGES in the types `Location`, `TokenIter`, `TokenStream` and `FileSource`
- Removed warnings in generated sources which were issued by `cargo doc`
- Please note that this version is incompatible with previous versions and works only with `parol` >= 0.24
- Providing location information on EOI tokens now to support error reporting
- Supports basic error recovery strategies in generated parsers
- Token mismatch and production prediction both handle synchronization of the input token stream with the expected input to enable further parsing
- To minimize the size of tokens, the types of some members of `Token` have been changed from `usize` to `u32`.
  - This is a BREAKING CHANGE! Sorry for the inconvenience.
- To support the new comment handling feature more generally, I added a new member `Token::token_number`, which is actually an index. So if you use tokens provided by `<UserType>GrammarTrait::on_comment_parsed`, you can now determine where exactly the comment token has been scanned in the input relative to other normal tokens. See the sketch below.
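A minimal sketch of how this could be used; the trait and type names are placeholders, the module path and the exact generated method signature are assumptions:

```rust
use parol_runtime::lexer::Token; // module path assumed

// Stand-in for the generated `<UserType>GrammarTrait`; the real trait is
// generated by `parol` and its exact signature is assumed here.
trait MyGrammarTrait<'t> {
    fn on_comment_parsed(&mut self, token: Token<'t>);
}

struct MyGrammar; // hypothetical user type

impl<'t> MyGrammarTrait<'t> for MyGrammar {
    fn on_comment_parsed(&mut self, token: Token<'t>) {
        // `token_number` is an index, so the comment can be ordered
        // relative to the normal tokens the parser consumes.
        eprintln!("comment #{}: {}", token.token_number, token.text());
    }
}
```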
- Update crate `regex-automata` to version 0.3.2
- New support for handling of user defined comments (`%line_comment`, `%block_comment`)
  - This library needs `parol` >= 0.22.0 to work properly
  - The new method `<UserType>GrammarTrait::on_comment_parsed` is called in order of appearance each time before the parser consumes a normal token from the token stream.
  - It is default implemented and the user can provide their own implementation if they are interested in comments.
  - This is minimal support but can greatly improve usability. Feedback is appreciated.
- More efficient implementation of the lookahead DFA
  - This can also lead to smaller generated parser files, up to about 5 percent
- Add new features to support static disabling of log levels during compile time (see issue #61)
  - Thanks to dalance for this proposal
- Exchanged `id_tree` for `syntree`
  - This includes major API changes that have an impact on user code. Please open discussions for migration support
- Filled in some missing source documentation
- Fixed issue #58
- ATTENTION! Incompatible change!
- Removed feature `trim_parse_tree`
  - Enable trimming of the parse tree in the build script by calling `trim_parse_tree` on the builder object, as sketched below
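A rough sketch of the corresponding build script; apart from `trim_parse_tree`, the builder method names and file paths shown here are assumptions based on a typical `parol` build script:

```rust
// build.rs (sketch; only `trim_parse_tree` is taken from this changelog entry,
// the other builder calls and paths are assumed/hypothetical)
fn main() {
    parol::build::Builder::with_explicit_output_dir("src/generated")
        .grammar_file("src/my_grammar.par")   // hypothetical grammar file
        .user_type_name("MyGrammar")          // hypothetical user type
        .user_trait_module_name("my_grammar") // hypothetical module name
        .trim_parse_tree()                    // replaces the removed crate feature
        .generate_parser()
        .unwrap();
}
```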
- New benches to measure performance of tokenizer
- Using `RegexSet` from the `regex-automata` crate as the foundation of tokenizing
  - This will result in a major performance boost
  - Currently unicode word boundaries are not supported, so one has to use ASCII word boundaries instead. Simply change occurrences of `\b` to `(?-u:\b)`.
- Removed clippy warning
- Removed `miette` as the error handling library
- General improvements of error handling
- Fixed the problem that regex for white spaces consumed newline characters
- Fixed issue #54
  - In `TokenStream` the size of the lookahead buffer is always at least 1
- Changed repository reference to the new location
- Otherwise fully compatible with version 0.11.1
- Merged PR #43 from ryo33
  - Use `\s` for `WHITESPACE_TOKEN`
- Supporting Span information for `parol`'s new feature to generate span calculation
- Using `derive_builder` in version 0.12.0 now so that we can use the re-export decently.
- Re-exporting `once_cell` now
- Merged PR #2 from ryo33. Kudos 👍
  - This introduces a new feature "auto_generation" that should be enabled for crates that use `parol`'s auto-generation mode. If you don't know exactly what this is, please enable this feature! I consider making it a default feature in a future release.
- `Token`: Fixed the method `to_owned` and added a method `into_owned`.
This release introduces breaking changes to the public API. To indicate this, we increased the minor version number.
- Removed the `OwnedToken` type and used `Cow` to hold the scanned text in `Token`s instead. Anyway, this member is private and can only be accessed via the method `text()`. See below for more on this new method.
- The `Token`'s constructor method `with` had a change in the type of the text parameter, which should be fairly easy to adapt in user code.
- The `Token`'s `to_owned` method returns a `Token` now.
- The parsed text of a token can now be accessed via the method `text()` of type `Token`. Formerly you used the member `symbol` directly, which is not possible anymore.
- Similarly, the method to access the token's text via `ParseTree` was renamed from `symbol()` to `text()` in the implementation of `parser::ParseTreeStackEntry`. A short sketch of the new accessor follows below.
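A small sketch of the new accessor on `Token`; the module path is assumed:

```rust
use parol_runtime::lexer::Token; // module path assumed

// The scanned text is now obtained through the `text()` accessor;
// formerly the public member `symbol` was read directly.
fn token_content<'t>(token: &'t Token<'t>) -> &'t str {
    token.text()
}
```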
- The types `errors::FileSource`, `lexer::Location` and `lexer::TokenIter` now internally use a `Cow<Path>` for holding the file name instead of a more expensive `Arc<PathBuf>`. This was originally chosen because of the necessity of `miette::SourceCode` to be `Send + Sync`. But the `Cow` will do the same with much less effort.
  - These changes affect user code due to changes in the methods `try_new` of `errors::FileSource`, `with` of `lexer::Location` and `new` of `lexer::TokenIter`
- Better diagnostics to support parol language server
- Changed the display format of `Location` to match vscode's format
- Improved traces
- Fixed a bug in `TokenStream::push_scanner`
- Improved debugging support for the error `pop from an empty scanner stack`
- New error type `ParserError::PopOnEmptyScannerStateStack`
- Made `ParseType` a `Copy`
- Using miette 0.5.1 now
- Also updated some other crate references
This version brings some rather breaking changes:
- Provide each token with the file name
  - Thus the `init` method could be removed from `UserActionsTrait`.
- Factored out the location information from the token types into a separate `Location` struct.
- Add explicit lifetimes in `UserActionsTrait` to aid the use of `Token<'t>` in `parol`'s auto-generation feature.
- New test for scanner state switching and the consistency of `miette::NamedSource`, which is produced from token stream and token span.
- `TokenStream::ensure_buffer` is called at the end of `TokenStream::consume` to have a more consistent behavior of `TokenStream::all_input_consumed`
- Optimized creation of `errors::FileSource` using the `TokenStream`
- Referencing `miette ^4.0` now.
- Better formatting of file paths
- Revived the `OwnedToken` type for the auto-generation feature of `parol`
- As of this version, a detailed changelog is maintained to help people keep track of changes that have been made since the last version of `parol_runtime`.
- A new (non-default) feature `trim_parse_tree` was added. The feature `trim_parse_tree` is useful if performance is a goal and the full parse tree is not needed at the end of the parse process. You can activate this feature in your dependencies with this entry:

  ```toml
  parol_runtime = { version = "0.5.5", default-features = false, features = ["trim_parse_tree"] }
  ```

  The parse tree returned from `LLKParser::parse` contains only the root node and is therefore useless if the feature is activated. Also note that you can't access the children of the nodes provided as parameters of your semantic actions (each of type `&ParseTreeStackEntry`) because they don't have children anymore. Therefore, navigating them will fail.

  This fixes issue (enhancement) #1