This repository was archived by the owner on Dec 15, 2022. It is now read-only.
Atomic grouping and possessive quantifiers are extremely useful in regular expressions. Supporting them can only speed up regex evaluation, as well as reduce the number of Find and Replace hangs that kill an Atom window.
Consider the regex `\b(integer|integral|integrity|insert|int|in)\b` matched against `integers`. The engine will match the first alternative, fail at the `\b`, then backtrack into the group and repeatedly match, fail, and backtrack through every single alternative, even matching the group in the last two cases only to fail at the `\b` again. This is quite a lot of work just to figure out `integers` isn't in our list of words. On the other hand, the atomic group `\b(?>integer|integral|integrity|insert|int|in)\b` will immediately give up matching the group at the second `\b` and never backtrack.
Consider also the regexes `"[^"]*"` and `"[^"]*+"` matched against `"aaaaaaaaaaaaaaaa`. Both fail, but the former backtracks through the entire string, while the latter fails immediately thanks to the possessive quantifier. When such behavior is desired (often for order-dependent parameters), this makes a huge difference for performance.
Both of these are simplistic examples, but it's easy to imagine more realistic cases where the patterns getting backtracked through contain multiple alternatives, subgroups, character classes, lookaheads, or all of these.
If, as #557 and #571 suggest, these features are missing because find-and-replace uses JavaScript's regex engine, then that needs to stop. Atom grammars use Oniguruma; so too must find-and-replace use a non-garbage regex engine.
(migrated issue)