Feature request: re_take_until #3

Open
@daboross

This is a continuation of rust-bakery/nom#709, now that nom's regex functions have moved into this crate.

I have a specific use case: I'd like to parse "[thing1, thing2part1,part2, thing3]" into ["thing1", "thing2part1,part2", "thing3"].

If I could guarantee no item-internal commas, I could implement this as:

delimited(
    tag("["),
    separated_list0(tag(", "), take_while(|c| ![',', ']'].contains(&c))),
    tag("]"),
)

However, the separator I actually want is the two-character sequence ", ", and items may contain a bare comma, so a per-character predicate like this won't work.
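To make the failure concrete, here is a rough std-only sketch of what the per-character predicate does on the second item. The helper name take_while_not_comma_or_bracket is hypothetical and only imitates the take_while call above; it is not nom's API.

```rust
/// Hypothetical stand-in for `take_while(|c| ![',', ']'].contains(&c))`.
/// Returns `(rest, taken)` in the same order a nom parser would.
fn take_while_not_comma_or_bracket(input: &str) -> (&str, &str) {
    // Stop at the first ',' or ']', or consume everything if neither occurs.
    let end = input
        .find(|c: char| c == ',' || c == ']')
        .unwrap_or(input.len());
    (&input[end..], &input[..end])
}
```

Calling it on "thing2part1,part2, thing3]" stops at the item-internal comma: the taken part is "thing2part1" and the rest begins with ",part2", which matches neither the ", " separator nor the closing "]".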

I could do something with take_till, and probably will for this case, but I think that ends up pretty hacky.

On the other hand, regex engines are optimized for this kind of matching, so I believe using regex (or anything aho_corasick-based) is the best option.

I'd propose re_take_until, matching the interface of complete::take_until except that it takes a regex rather than a byte or string pattern.

It'd be used like this to match the above pattern:

let closing_bracket_or_comma_space = regex::Regex::new(r"]|, ").unwrap();
delimited(
    tag("["),
    separated_list0(tag(", "), re_take_until(closing_bracket_or_comma_space)),
    tag("]"),
)
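For reference, the behaviour I have in mind can be sketched without nom or regex at all. This is only an illustration under stated assumptions: take_until_any and parse_list are hypothetical names, and the regex alternation `]|, ` is modelled as a plain list of literal patterns rather than a real regex.

```rust
/// Hypothetical std-only stand-in for the proposed `re_take_until`.
/// Takes input up to (not including) the earliest occurrence of any pattern.
/// Returns `(rest, taken)`, or `None` if no pattern occurs in the input.
fn take_until_any<'a>(input: &'a str, patterns: &[&str]) -> Option<(&'a str, &'a str)> {
    // Earliest match position across all patterns.
    let end = patterns.iter().filter_map(|p| input.find(*p)).min()?;
    Some((&input[end..], &input[..end]))
}

/// Splits "[a, b, c]" on ", " only, so bare commas stay inside an item.
fn parse_list(input: &str) -> Option<Vec<&str>> {
    let mut rest = input.strip_prefix('[')?;
    let mut items = Vec::new();
    loop {
        let (after, item) = take_until_any(rest, &["]", ", "])?;
        items.push(item);
        match after.strip_prefix(", ") {
            // A separator follows, so keep going with the next item.
            Some(next) => rest = next,
            None => {
                // Not a separator, so this must be the closing bracket.
                after.strip_prefix(']')?;
                return Some(items);
            }
        }
    }
}
```

With this sketch, parse_list("[thing1, thing2part1,part2, thing3]") yields the three items from the example at the top, with "thing2part1,part2" kept whole. A real re_take_until would presumably do the same search with the regex's own find instead of the literal-pattern loop.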
