syntax: accept {,n} as an equivalent to {0,n} #1086
Most regular expression engines don't accept the `{,n}` syntax, but some others do (namely Python's `re` library). This introduces a new parser configuration option that enables the `{,n}` syntax.
Out of curiosity, is this the only additional syntax you need to support and everything else is in sync? I would be surprised if so. I ask this because if this is literally the only thing, then I can see how "maintain a fork of Another possible route here is to just unconditionally enable this and support it in
Actually there are a few other incompatibilities, but either they are less prevalent in real-life regexps, or they are actually beneficial in the long term. One case is the support for the unescaped In the case of Maybe we can simply hold off on this change until I advance a little more. If I eventually need more changes like this one, then I must go down the fork path; if not, I can pick this PR up again and even provide the full implementation
Yeah I mean I would argue that the benefit here is that I like the idea of holding off until you advance a bit more. If this turns out to literally be the only thing you need then I might be able to get on board with that.
Now the `RepetitionCountUnclosed` error takes precedence over `RepetitionCountDecimalEmpty`, even if the `{,n}` syntax is not accepted. I think this is even desirable, as `RepetitionCountUnclosed` describes the situation better when an unclosed `{` is found in the regex.
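To make the precedence change concrete, here is a minimal, self-contained sketch. It is not the `regex-syntax` parser: `CountError` and `check_count` are hypothetical names invented for illustration, and the variants only mirror the two error kinds named above. The point is the ordering: the missing `}` is reported first, and the empty minimum is reported only once the braces are known to be closed.

```rust
/// Hypothetical error type; the variants mirror the two error kinds
/// discussed above but are NOT the real `regex-syntax` types.
#[derive(Debug, PartialEq, Eq)]
pub enum CountError {
    /// The opening `{` never gets a matching `}`.
    Unclosed,
    /// The minimum count is missing, as in `{,5}` or `{}`.
    DecimalEmpty,
}

/// Toy check for a counted repetition such as `{2,5}`.
/// Assumption: the `{,n}` form is rejected here (option disabled).
pub fn check_count(s: &str) -> Result<(), CountError> {
    let rest = s.strip_prefix('{').unwrap_or(s);

    // Look for the closing brace first, so that an unclosed `{`
    // wins over any complaint about what is inside the braces.
    let end = match rest.find('}') {
        Some(end) => end,
        None => return Err(CountError::Unclosed),
    };

    // Only now inspect the minimum count.
    let inner = &rest[..end];
    let (min, _max) = inner.split_once(',').unwrap_or((inner, ""));
    if min.is_empty() {
        return Err(CountError::DecimalEmpty);
    }
    Ok(())
}
```

With this ordering, `check_count("{,5")` reports `Unclosed` rather than `DecimalEmpty`, while `check_count("{,5}")` still reports `DecimalEmpty`.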
@BurntSushi I would like to rescue this PR for your consideration again. After using This PR adds support for A side-effect is that some errors change from
@BurntSushi any comments on this?
Hi @BurntSushi, I would like to publish the crate I was working on, which depends on a modified version of If you are ok with merging this change that would save my day, if not, I think my next best option is including Not being allowed to publish a crate that depends on some public GitHub repository (even if not published in crates.io) was quite unexpected to me :(
Yeah, sorry about my absence here. I've been focused on other things. I think my main decision point here is whether I just want to enable it for everyone (so don't make it conditional), or to keep it as you have here via an option. I do kind of feel like overall the syntax
This is on crates.io in
That was quick! Thank you so much.
Now that rust-lang/regex#1086 was merged, I can stop depending on my own clone and can rely on the official implementation of `regex-syntax`.
Most regular expression engines don't accept the `{,n}` syntax, but some others do (namely Python's `re` library). This introduces a new parser configuration option that enables the `{,n}` syntax as equivalent to `{0,n}`. This option is disabled by default and not exposed in the `regex` crate.

I understand this change may imply a deviation from the goals of the `regex-syntax` crate. If the purpose of the `regex-syntax` crate is exposing the parser used by the `regex` crate and supporting a very specific regex syntax flavor, then this PR can be closed. However, I'm bringing this PR into consideration because I'm using `regex-syntax` in a project that requires support for the `{,n}` syntax for backward-compatibility with existing regular expressions. The alternative for me is maintaining a fork of `regex-syntax` forever.

Also, I think `empty_min_range` is not a very good name, but I couldn't come up with a better name. Better naming alternatives are welcome, even if this PR is not finally merged.
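As a rough illustration of the behavior the option is meant to toggle, the sketch below parses a single counted repetition on its own. It is a toy under stated assumptions, not the code in this PR: `RangeOptions`, `parse_range`, and `decimal` are hypothetical names, only the field name `empty_min_range` is borrowed from the description, and the choice to reject `{,}` is an assumption of this sketch.

```rust
/// Hypothetical options struct; only the field name is taken from the PR.
#[derive(Clone, Copy, Debug, Default)]
pub struct RangeOptions {
    /// When true, `{,n}` is read as `{0,n}`. Off by default.
    pub empty_min_range: bool,
}

/// Parses a non-empty run of ASCII digits (rejects signs such as `+5`).
fn decimal(s: &str) -> Option<u32> {
    if s.bytes().all(|b| b.is_ascii_digit()) {
        s.parse().ok() // fails on the empty string or on overflow
    } else {
        None
    }
}

/// Parses `{n}`, `{m,}`, `{m,n}` and, if enabled, `{,n}`.
/// Returns `(min, max)`, where `max == None` means "unbounded".
pub fn parse_range(s: &str, opts: RangeOptions) -> Option<(u32, Option<u32>)> {
    let inner = s.strip_prefix('{')?.strip_suffix('}')?;
    match inner.split_once(',') {
        // `{n}`: exactly n repetitions.
        None => {
            let n = decimal(inner)?;
            Some((n, Some(n)))
        }
        // `{,n}`: accepted only when the option is on.
        // Assumption of this sketch: `{,}` is rejected either way.
        Some(("", max)) => {
            if !opts.empty_min_range {
                return None;
            }
            Some((0, Some(decimal(max)?)))
        }
        // `{m,}`: at least m repetitions.
        Some((min, "")) => Some((decimal(min)?, None)),
        // `{m,n}`: between m and n repetitions.
        Some((min, max)) => Some((decimal(min)?, Some(decimal(max)?))),
    }
}
```

With the flag set, `parse_range("{,5}", RangeOptions { empty_min_range: true })` yields the same `(0, Some(5))` as `{0,5}`; with the default options the same input is rejected, which matches the "disabled by default" behavior described above.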